Given an array of log strings:

    log = [
        '[WARNING] 403 Forbidden: No token in request parameters',
        '[ERROR] 500 Server Error: int is not subscription',
        '[INFO] 200 OK: Login Successful',
        '[INFO] 200 OK: User sent a message',
        '[ERROR] 500 Server Error: int is not subscription'
    ]

I am trying to get better at using dictionaries in Python, and I want to iterate over this array and print something like this:

    {'WARNING': {'403': {'Forbidden': {'No token in request parameters': 1}}},
     'ERROR': {'500': {'Server Error': {'int is not subscription': 2}}},
     'INFO': {'200': {'OK': {'Login Successful': 1, 'User sent a message': 1}}}}

Essentially, I want to return a dictionary that holds the log statistics in a format like the one above. I started writing my method, and this is what I have so far:

    def logInfo(logs):
        dct = {}
        for log in logs:
            log = log.strip().split()
            if log[2] == "Server":
                log[2] = "Server Error:"
                log.remove(log[3])
            #print(log)
            joined = " ".join(log[3:])
            if log[0] not in dct:
                log[0] = log[0].strip('[').strip(']')
                dct[log[0]] = {}
                if log[1] not in dct[log[0]]:
                    dct[log[0]][log[1]] = {}
                    if log[2] not in dct[log[0]][log[1]]:
                        dct[log[0]][log[1]][log[2]] = {}
                        if joined not in dct:
                            dct[log[0]][log[1]][log[2]][joined] = 1
                        else:
                            dct[log[0]][log[1]][log[2]][joined] += 1
            else:
                dct[joined].append(joined)
        print(dct)

It prints the following:

    {'WARNING': {'403': {'Forbidden:': {'No token in request parameters': 1}}}, 'ERROR': {'500': {'Server Error:': {'int is not subscription': 1}}}, 'INFO': {'200': {'OK:': {'User sent a message': 1}}}}

The counts are wrong, and the method itself is also very long. Can someone help me, or hint at a more skillful way to handle this?

Answered 2021-07-13 02:10:53

I went through your code and found a fix; it works well now. Three things needed to change:

The if checks should all sit at the same level. Each one gives a missing key an empty dict, so the next check works whether or not the parent key already existed.

You test log[0] not in dct before calling strip('[').strip(']'), so the bracketed key is never found and you always overwrite the previous data. I fixed this and marked it in the code.

You test joined not in dct, but you should test it in dct[log[0]][log[1]][log[2]]. I fixed this and marked it in the code below.

    def logInfo(logs):
        dct = {}
        for log in logs:
            log = log.strip().split()
            if log[2] == "Server":
                log[2] = "Server Error:"
                log.remove(log[3])
            # drop the trailing colon so the key matches the desired output
            log[2] = log[2].rstrip(':')
            #print(log)
            joined = " ".join(log[3:])
            # moved up: strip the brackets before the membership test
            log[0] = log[0].strip('[').strip(']')
            if log[0] not in dct:
                # log[0] = log[0].strip('[').strip(']')
                dct[log[0]] = {}
            if log[1] not in dct[log[0]]:
                dct[log[0]][log[1]] = {}
            if log[2] not in dct[log[0]][log[1]]:
                dct[log[0]][log[1]][log[2]] = {}
            # the message must be looked up in the innermost dict, not the root dct
            # if joined not in dct:
            if joined not in dct[log[0]][log[1]][log[2]]:
                dct[log[0]][log[1]][log[2]][joined] = 1
            else:
                dct[log[0]][log[1]][log[2]][joined] += 1
        print(dct)

Answered 2021-07-13 01:31:10
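As a side note on the previous answer: each of its sibling if checks creates a missing level before descending into it, which is what dict.setdefault does in one call. The following is only a minimal sketch, not the answerer's code. The function name count_logs is hypothetical, and it assumes every line has the shape "[LEVEL] CODE STATUS: MESSAGE"; a line without that shape would raise a ValueError when unpacking.

```python
def count_logs(logs):
    """Build {level: {code: {status: {message: count}}}} from log lines.

    Hypothetical helper; assumes each line looks like
    "[LEVEL] CODE STATUS: MESSAGE".
    """
    stats = {}
    for line in logs:
        # Split once at the first ": " into the header and the message.
        head, _, message = line.strip().partition(": ")
        # The header has three parts; the status text may contain spaces.
        level, code, status = head.split(" ", 2)
        level = level.strip("[]")
        # setdefault returns the existing child dict, or inserts a new one.
        bucket = stats.setdefault(level, {}).setdefault(code, {}).setdefault(status, {})
        bucket[message] = bucket.get(message, 0) + 1
    return stats
```

Calling count_logs(log) on the list from the question returns a plain nested dict, so no conversion step is needed before printing it.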
You can use re.findall and collections.defaultdict:

    import re, collections

    r = collections.defaultdict(dict)
    log = ['[WARNING] 403 Forbidden: No token in request parameters', '[ERROR] 500 Server Error: int is not subscription', '[INFO] 200 OK: Login Successful', '[INFO] 200 OK: User sent a message', '[ERROR] 500 Server Error: int is not subscription']
    for i in log:
        # the four alternatives match, in order: level, status code, status text, message
        a, b, c, d = map(str.strip, re.findall(r'(?<=\[)\w+(?=\])|(?<=\]\s)\d+|(?<=\d\s)[\w\s]+(?=:)|(?<=:)[\w+\s]+$', i))
        if b not in r[a]:
            r[a][b] = collections.defaultdict(dict)
        if c not in r[a][b]:
            r[a][b][c] = collections.defaultdict(int)
        r[a][b][c][d] += 1
    print(r)

Output:

    defaultdict(<class 'dict'>, {'WARNING': {'403': defaultdict(<class 'dict'>, {'Forbidden': defaultdict(<class 'int'>, {'No token in request parameters': 1})})}, 'ERROR': {'500': defaultdict(<class 'dict'>, {'Server Error': defaultdict(<class 'int'>, {'int is not subscription': 2})})}, 'INFO': {'200': defaultdict(<class 'dict'>, {'OK': defaultdict(<class 'int'>, {'Login Successful': 1, 'User sent a message': 1})})}})

If you only need a plain dictionary, you can recursively convert the nested collections.defaultdict objects:

    def to_dict(d):
        # recurse into every nested mapping; the integer counts are the leaves
        return {a: to_dict(b) if not isinstance(b, int) else b for a, b in d.items()}

    print(to_dict(r))

Output:

    {'WARNING': {'403': {'Forbidden': {'No token in request parameters': 1}}},
     'ERROR': {'500': {'Server Error': {'int is not subscription': 2}}},
     'INFO': {'200': {'OK': {'Login Successful': 1, 'User sent a message': 1}}}}

https://stackoverflow.com/questions/68355362
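
The defaultdict approach in the second answer still needs an explicit check per level. One possible variation, sketched here under assumptions, is a self-referencing defaultdict that creates any missing level on access, paired with a single regular expression that uses named groups. The names LINE_RE, tree and count_logs_tree are hypothetical, the pattern assumes the "[LEVEL] CODE STATUS: MESSAGE" shape seen in the question, and lines that do not fit are skipped silently.

```python
import collections
import json
import re

# Assumed line shape: "[LEVEL] CODE STATUS: MESSAGE" (hypothetical pattern).
LINE_RE = re.compile(
    r"\[(?P<level>\w+)\]\s+(?P<code>\d+)\s+(?P<status>[^:]+):\s*(?P<message>.*)"
)


def tree():
    """Return a defaultdict whose missing keys become further trees."""
    return collections.defaultdict(tree)


def count_logs_tree(logs):
    stats = tree()
    for line in logs:
        m = LINE_RE.fullmatch(line.strip())
        if m is None:
            continue  # skip lines that do not fit the assumed shape
        # Indexing creates each missing level automatically.
        leaf = stats[m["level"]][m["code"]][m["status"]]
        # .get does not trigger the default factory, so a new message starts at 0.
        leaf[m["message"]] = leaf.get(m["message"], 0) + 1
    # All keys are strings and all leaves are ints, so a JSON round trip
    # is a simple way to turn the nested defaultdicts into plain dicts.
    return json.loads(json.dumps(stats))
```

Unlike the lookaround pattern above, the message group here accepts any characters, so messages containing punctuation are still counted.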