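I have this .txt file with some logs. I am trying to assign the value "-" to "user_name" in case a line has no user name. However, in those cases the output dict shows "user_name": "". Do you know what I am doing wrong? I also tried item.groupdict["user_name"] = "-", but apparently that is not the source of the issue.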
import re

with open("logdata.txt", "r") as file:
    logdata = file.read()

logs = []
pattern = """
(?P<host>[\d.]+)[-\s]+
(?P<user_name>\w*)\s+
\[(?P<time>[^][]+)\]\s+
"(?P<request>[^"]+)"
"""
for item in re.finditer(pattern,logdata,re.VERBOSE):
    if item.groupdict("user_name") == None:
        item["user_name"] = '-'
    logs.append(item.groupdict())

Here is a sample of the txt:
71.172.239.195 - dooley1853 [21/Jun/2019:15:45:32 -0700] "PUT /cutting-edge HTTP/2.0" 406 24498
180.95.121.94 - mohr6893 [21/Jun/2019:15:45:34 -0700] "PATCH /extensible/reinvent HTTP/1.1" 201 27330
144.23.247.108 - auer7552 [21/Jun/2019:15:45:35 -0700] "POST /extensible/infrastructures/one-to-one/enterprise HTTP/1.1" 100 22921

Posted on 2020-11-02 04:06:12
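A likely reason the check in the question never fires can be seen in a small sketch. It assumes a made-up log line whose user field is "-" (the sample above contains no such line), and it uses the question's pattern written as a raw string. The names `line`, `as_called_in_question`, `check_fires` and `captured` are only for illustration.

```python
import re

# Pattern from the question, written as a raw string.
pattern = r"""
(?P<host>[\d.]+)[-\s]+
(?P<user_name>\w*)\s+
\[(?P<time>[^][]+)\]\s+
"(?P<request>[^"]+)"
"""

# Hypothetical log line (not from the original file) whose user field is "-".
line = '10.0.0.1 - - [21/Jun/2019:15:45:40 -0700] "GET /index HTTP/1.1" 200 512'

match = re.search(pattern, line, re.VERBOSE)

# groupdict("user_name") treats "user_name" as the default value for groups
# that did not take part in the match, and it returns a dict. Comparing that
# dict to None is therefore always False.
as_called_in_question = match.groupdict("user_name")
check_fires = as_called_in_question is None

# \w* may match zero characters, so the group still takes part in the match
# and captures an empty string instead of None.
captured = match.group("user_name")

print(type(as_called_in_question).__name__, check_fires, repr(captured))
```

Even if the condition were true, `item["user_name"] = '-'` would raise a TypeError, because match objects allow reading a group by name but not assigning to it.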
You can save item.groupdict() in a variable and modify its contents before appending it to logs.
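for item in re.finditer(pattern, logdata, re.VERBOSE):
    params = item.groupdict()
    # \w* can match zero characters, so a missing user name is captured as ""
    if params["user_name"] == "":
        params["user_name"] = "-"
    logs.append(params)

Alternatively, you can change the regular expression so that the second group is optional (?) but only matches words of at least one character (\w+), and then use the default parameter of item.groupdict().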
import re

with open("logdata.txt", "r") as file:
    logdata = file.read()

logs = []
# Raw string, so that \d, \s and \w reach the regex engine unchanged
pattern = r"""
(?P<host>[\d.]+)[-\s]+
(?P<user_name>\w+)?\s+
\[(?P<time>[^][]+)\]\s+
"(?P<request>[^"]+)"
"""
for item in re.finditer(pattern, logdata, re.VERBOSE):
    # default="-" fills in groups that did not take part in the match
    logs.append(item.groupdict(default="-"))

Posted on 2021-03-08 23:22:43
I used a simple approach:
\S+ matches one or more characters that are not whitespace.
import re

with open("logdata.txt", "r") as file:
    logdata = file.read()

logs = []
# Raw string, so that \d, \s and \S reach the regex engine unchanged
pattern = r"""
(?P<host>[\d.]+)[-\s]+
(?P<user_name>\S+)\s+
\[(?P<time>[^][]+)\]\s+
"(?P<request>[^"]+)"
"""
for item in re.finditer(pattern, logdata, re.VERBOSE):
    logs.append(item.groupdict())

It worked fine. I guess I'm a bit late.
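To compare the three user_name groups discussed in this thread side by side, here is a minimal sketch. It runs them on one line from the question's sample plus one made-up line with "-" in place of the user name. The second line, `TEMPLATE`, `LOG_LINES` and the helper `user_names` are assumptions for illustration and are not part of the original posts.

```python
import re

# The question's pattern with a placeholder for the user_name group.
TEMPLATE = r"""
(?P<host>[\d.]+)[-\s]+
{user_group}\s+
\[(?P<time>[^][]+)\]\s+
"(?P<request>[^"]+)"
"""

LOG_LINES = [
    # Taken from the sample in the question.
    '71.172.239.195 - dooley1853 [21/Jun/2019:15:45:32 -0700] "PUT /cutting-edge HTTP/2.0" 406 24498',
    # Hypothetical line with "-" in place of the user name.
    '10.0.0.1 - - [21/Jun/2019:15:45:40 -0700] "GET /index HTTP/1.1" 200 512',
]
logdata = "\n".join(LOG_LINES)


def user_names(user_group, default=None):
    """Return the user_name captured for every log line that matches."""
    pattern = TEMPLATE.format(user_group=user_group)
    return [
        m.groupdict(default=default)["user_name"]
        for m in re.finditer(pattern, logdata, re.VERBOSE)
    ]


# \w* from the question: the missing name is captured as an empty string.
original = user_names(r"(?P<user_name>\w*)")

# Optional \w+ group plus groupdict(default="-"): the group is skipped,
# so the default is substituted.
optional_group = user_names(r"(?P<user_name>\w+)?", default="-")

# \S+ group: the literal hyphen in the log line is what gets captured.
non_space = user_names(r"(?P<user_name>\S+)")

print(original)        # ['dooley1853', '']
print(optional_group)  # ['dooley1853', '-']
print(non_space)       # ['dooley1853', '-']
```

On these inputs the last two variants give the same result, but for different reasons: the optional group relies on the substituted default, while \S+ simply captures the "-" that is already in the log.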
https://stackoverflow.com/questions/64639479