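I have run into a parsing problem in Python 2.7. Let me explain:
I am parsing events from the Incapsula API. The goal is to make them readable in an Excel sheet so that I can produce statistics and charts.
The signature field holds the type of each event/attack together with a number, which is the number of attacks of that type. I therefore decided to replicate each line as many times as the sum of the attack counts that follow "signature=".
For example: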
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3}
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3}
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3}
So far everything works as expected, and I get the correct number of attacks.
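As a minimal sketch of the counting step described above, the signature field can be split into (threat, count) pairs before summing. The helper name `signature_counts` and the regular expression are illustrative assumptions, not part of the original script; the pattern assumes threat names contain only word characters and dots. The snippet is written so that it runs on both Python 2.7 and Python 3.

```python
import re

# Matches "name=number" pairs. Assumption: threat names consist only of
# word characters and dots, as in the sample log lines.
PAIR_RE = re.compile(r'([\w.]+)=(\d+)')
SIG = 'signature={'


def signature_counts(line):
    """Return a list of (threat, count) pairs from one log line."""
    start = line.find(SIG)
    if start == -1:
        return []  # no signature field on this line
    end = line.find('}', start)
    if end == -1:
        end = len(line)  # tolerate a missing closing brace
    body = line[start + len(SIG):end]
    return [(name, int(number)) for name, number in PAIR_RE.findall(body)]


line = ('visit_id=1, src_country=Ukraine, '
        'signature={api.threats.sql_injection=3, '
        'api.threats.bot_access_control=1,}')
pairs = signature_counts(line)
total = sum(count for _, count in pairs)
print(pairs)
print(total)
```

Restricting the search to the text between the braces keeps digits elsewhere in the line (IDs, timestamps, IP addresses) out of the sum.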
However, for some rare events the signature field contains multiple values, as in the following sample:
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
visit_id=324001290181618591, src_country=Ukraine, event_timestamp=1484493309742, src_ip=91.223.133.30, dest_name=www.xxx.com, dest_id=1551642, signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
visit_id=86001060468746692, src_country=Netherlands, event_timestamp=1483867285054, src_ip=178.22.232.53, dest_name=www.yyy.com, dest_id=1551642, signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
visit_id=86001060468746692, src_country=Netherlands, event_timestamp=1483867285054, src_ip=178.22.232.53, dest_name=www.yyy.com, dest_id=1551642, signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
visit_id=86001060468746692, src_country=Netherlands, event_timestamp=1483867285054, src_ip=178.22.232.53, dest_name=www.yyy.com, dest_id=1551642, signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
visit_id=86001060468746692, src_country=Netherlands, event_timestamp=1483867285054, src_ip=178.22.232.53, dest_name=www.yyy.com, dest_id=1551642, signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
For those rare lines I still get the correct attack count, but I would like to rearrange the signature field from this:
signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
to this:
signature={api.threats.sql_injection}
signature={api.threats.sql_injection}
signature={api.threats.sql_injection}
signature={api.threats.bot_access_control}
signature={api.threats.illegal_resource_access}
signature={api.threats.cross_site_scripting}
signature={api.threats.bot_access_control}
signature={api.threats.illegal_resource_access}
signature={api.threats.illegal_resource_access}
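signature={api.threats.illegal_resource_access}
(The first six lines are the first event repeated 6 times (3+1+1+1 = 6), and the last four lines are the second event repeated 4 times (1+3 = 4).)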
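The arithmetic in the note above can be checked with a short sketch. It uses `collections.Counter`, whose `elements()` method repeats each key as many times as its count; the variable names are illustrative only. The order of the repeated names follows insertion order only on Python 3.7 and later, so the sketch checks the totals rather than the order.

```python
from collections import Counter

# Counts copied from the two multi-value sample events.
first_event = Counter({
    'api.threats.sql_injection': 3,
    'api.threats.bot_access_control': 1,
    'api.threats.illegal_resource_access': 1,
    'api.threats.cross_site_scripting': 1,
})
second_event = Counter({
    'api.threats.bot_access_control': 1,
    'api.threats.illegal_resource_access': 3,
})

# elements() yields each threat name once per occurrence.
first_rows = list(first_event.elements())
second_rows = list(second_event.elements())
print(len(first_rows))
print(len(second_rows))
```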
My current source code:
import re
from itertools import izip

# count the number of attacks per line
f = open('monthlyLogShort.txt', 'r')
g = open("count.txt", 'w')
kensu = f.readlines()
f.close()
for line in kensu:
    st = line.find('signature=')
    end = line.find('}')
    unprecise = line[st:end+1]
    # count = int(re.search(r'\d+', unprecise).group())
    count = sum(map(int, re.findall(r'[0-9]+', unprecise)))
    print >> g, count
g.close()

# replicate lines according to the number of attacks
h = open('flog.txt', 'w')
with open("monthlyLogShort.txt") as textfile1, open("count.txt") as textfile2:
    for x, y in izip(textfile1, textfile2):
        x = x.strip()
        y = y.strip()
        # one copy of the line per attack, each on its own line
        print >> h, '\n'.join([x] * int(y))
h.close()

Answered 2017-01-29 22:38:47
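If I am reading your requirement correctly, you want to emit one line for every occurrence of a threat while keeping the rest of the record. This solution does not output the counts directly; instead it transforms the data so that each line carries exactly one threat.
Code: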
sig_str = 'signature={'
for line in kensu:
    # split the record from its signature field
    record, signature = line.split(sig_str)
    threats = signature.split('}')[0]
    for counts in threats.split(','):
        # skip the empty piece left by a trailing comma
        if '=' in counts:
            threat, count = counts.split('=')
            for i in range(int(count)):
                print '%s%s%s}' % (record, sig_str, threat.strip())

Sample data:
kensu = [x.strip() for x in """
record=0, signature={api.threats.sql_injection=1}
record=1, signature={api.threats.sql_injection=3, api.threats.bot_access_control=1, api.threats.illegal_resource_access=1, api.threats.cross_site_scripting=1,}
record=2, signature={api.threats.bot_access_control=1, api.threats.illegal_resource_access=3,}
""".split('\n')[1:-1]]

Output:
record=0, signature={api.threats.sql_injection}
record=1, signature={api.threats.sql_injection}
record=1, signature={api.threats.sql_injection}
record=1, signature={api.threats.sql_injection}
record=1, signature={api.threats.bot_access_control}
record=1, signature={api.threats.illegal_resource_access}
record=1, signature={api.threats.cross_site_scripting}
record=2, signature={api.threats.bot_access_control}
record=2, signature={api.threats.illegal_resource_access}
record=2, signature={api.threats.illegal_resource_access}
record=2, signature={api.threats.illegal_resource_access}
https://stackoverflow.com/questions/41919948
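For reference, the same idea can be wrapped in a generator so that it reads from a file and writes the expanded lines in one pass, without the intermediate count file. This is only a sketch under stated assumptions: the names `expand_events` and `expand_file` are hypothetical, lines without a signature field are skipped (they could equally be passed through unchanged), and the code is written to run on both Python 2.7 and Python 3.

```python
SIG = 'signature={'


def expand_events(lines):
    """Yield one output line per threat occurrence, keeping the record part."""
    for line in lines:
        record, sep, signature = line.strip().partition(SIG)
        if not sep:
            continue  # assumption: skip lines that have no signature field
        threats = signature.split('}')[0]
        for item in threats.split(','):
            threat, eq, count = item.strip().partition('=')
            if not eq:
                continue  # empty piece left by a trailing comma
            for _ in range(int(count)):
                yield '%s%s%s}' % (record, SIG, threat)


def expand_file(src_path, dst_path):
    """Stream src_path through expand_events and write the result to dst_path."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for out in expand_events(src):
            dst.write(out + '\n')


sample = [
    'record=1, signature={api.threats.bot_access_control=1, '
    'api.threats.illegal_resource_access=2,}\n',
    'record=2, no signature field here\n',
]
result = list(expand_events(sample))
for row in result:
    print(row)
```

With the file names from the question, the call would be `expand_file('monthlyLogShort.txt', 'flog.txt')`. Because the generator consumes lines lazily, the whole log never has to be held in memory.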