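So my task is to convert a string into a dict (using regex is required). I have already used findall to separate out each element, but I don't know how to combine the results.
I have the following code: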
import re

def edata():
    with open("employeedata.txt", "r") as file:
        employeedata = file.read()
    IP_field = re.findall(r"\d+[.]\d+[.]\d+[.]\d+", employeedata)
    username_field = re.findall(r"[a-z]+\d+|- -", employeedata)
    date_field = re.findall(r"\d+\/[A-Z][a-z][0-9]+\/\d\d\d\d:\d+:\d+:\d+ -\d+", employeedata)
    type_field = re.findall(r'"(.*)?"', employeedata)
    Fields = ["IP", "username", "date", "type"]
    Fields2 = IP_field, username_field, date_field, type_field
    dictionary = dict(zip(Fields, Fields2))
    return dictionary

print(edata())

Current output:
{ "IP": ["190.912.120.151", "190.912.120.151"], "username": ["skynet10001", "skynet10001"] etc }

Expected output:
[{ "IP": "190.912.120.151", "username": "skynet10001" etc },
{ "IP": "190.912.120.151", "username": "skynet10001" etc }]

Posted on 2021-12-12 21:05:34
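Another solution, which reuses the dictionary you have already built. This code uses a list comprehension and the zip function to turn the existing dictionary variable into a list of dictionaries.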
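The transposition step can be tried on its own. This is a minimal sketch with made-up in-memory lists standing in for the findall results; the helper name columns_to_rows is hypothetical and not part of the answer.

```python
# Made-up column lists standing in for the re.findall results.
columns = {
    "IP": ["190.912.120.151", "221.143.119.260"],
    "username": ["skynet10001", "terminator002"],
}


def columns_to_rows(columns):
    """Turn a dict of lists into a list of dicts, one dict per row."""
    keys = list(columns)
    # zip(*values) regroups the i-th item of every list into one tuple.
    return [dict(zip(keys, row)) for row in zip(*columns.values())]


rows = columns_to_rows(columns)
print(rows)
```

Note that zip stops at the shortest list, so if one findall call returns fewer matches than the others (for example because a line lacks a field), rows can be dropped or misaligned without any error. The full function from this answer follows.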
import re

def edata():
    with open("employeedata.txt", "r") as file:
        employeedata = file.read()
    IP_field = re.findall(r"\d+[.]\d+[.]\d+[.]\d+", employeedata)
    username_field = re.findall(r"[a-z]+\d+|- -", employeedata)
    date_field = re.findall(r"\[(.*?)\]", employeedata)  # changed your regex for the date field
    type_field = re.findall(r'"(.*)?"', employeedata)
    Fields = ["IP", "username", "date", "type"]
    Fields2 = IP_field, username_field, date_field, type_field
    dictionary = dict(zip(Fields, Fields2))
    result_dictionary = [dict(zip(dictionary, i)) for i in zip(*dictionary.values())]  # convert to list of dictionaries
    return result_dictionary

print(edata())

Posted on 2021-12-12 20:52:08
You can use:
import re

rx = re.compile(r'^(?P<IP>\d+(?:\.\d+){3})\s+\S+\s+(?P<Username>[a-z]+\d+)\s+\[(?P<Date>[^][]+)]\s+"(?P<Type>[^"]*)"')

def edata():
    results = []
    with open("downloads/employeedata.txt", "r") as file:
        for line in file:
            match = rx.search(line)
            if match:
                results.append(match.groupdict())
    return results

print(edata())

See the online Python demo. With file = ['190.912.120.151 - skynet10001 [19/Jan/2012] "Temp"', '221.143.119.260 - terminator002 [16/Feb/2021] "Temp 2"'] as input, the output will be:
[{'IP': '190.912.120.151', 'Username': 'skynet10001', 'Date': '19/Jan/2012', 'Type': 'Temp'}, {'IP': '221.143.119.260', 'Username': 'terminator002', 'Date': '16/Feb/2021', 'Type': 'Temp 2'}]

The regex is:
^(?P<IP>\d+(?:\.\d+){3})\s+\S+\s+(?P<Username>[a-z]+\d+)\s+\[(?P<Date>[^][]+)]\s+"(?P<Type>[^"]*)"

See the regex demo. Details:
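^ - start of string
(?P<IP>\d+(?:\.\d+){3}) - Group "IP": one or more digits, then three occurrences of a . followed by one or more digits
\s+\S+\s+ - one or more non-whitespace chars enclosed by one or more whitespace chars on both sides
(?P<Username>[a-z]+\d+) - Group "Username": one or more lowercase letters, then one or more digits
\s+ - one or more whitespace chars
\[ - a [ char
(?P<Date>[^][]+) - Group "Date": one or more chars other than [ and ]
]\s+" - a ] char, one or more whitespace chars, a " char
(?P<Type>[^"]*) - Group "Type": zero or more chars other than "
" - a " char.

https://stackoverflow.com/questions/70327192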
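The breakdown above can also be carried inside the pattern itself. The following is a sketch only: it lays out the same regex with re.VERBOSE, and the names LOG_LINE and parse_lines, as well as the second sample line (one with no username), are hypothetical additions for illustration.

```python
import re

# The answer's pattern laid out with re.VERBOSE: whitespace and # comments
# outside character classes are ignored by the regex engine in this mode.
LOG_LINE = re.compile(
    r"""
    ^(?P<IP>\d+(?:\.\d+){3})      # four dot-separated runs of digits
    \s+\S+\s+                     # one non-whitespace token between whitespace
    (?P<Username>[a-z]+\d+)       # lowercase letters followed by digits
    \s+\[(?P<Date>[^][]+)]        # text inside square brackets
    \s+"(?P<Type>[^"]*)"          # text inside double quotes
    """,
    re.VERBOSE,
)


def parse_lines(lines):
    """Return one dict per line that matches LOG_LINE; skip the rest."""
    rows = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match:
            rows.append(match.groupdict())
    return rows


sample = [
    '190.912.120.151 - skynet10001 [19/Jan/2012] "Temp"',
    '10.0.0.1 - - [19/Jan/2012] "Temp"',  # hypothetical line with no username
]
print(parse_lines(sample))
```

Because the Username group requires letters followed by digits, a line whose username field is just - does not match and is skipped. If such lines should be kept, the group would need an extra alternative, which is left out here.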