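I can't come up with a good regular expression to capture all of the data between --c5eda821-H- and --c5eda821-Z-.
The regular expression I am using for this is: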
re.compile('--([a-f0-9]{8})-H-(.+?)--[a-f0-9]{8}', re.MULTILINE | re.DOTALL)
--c5eda821-F--
HTTP/1.1 200 OK
X-Powered-By: PHP/5.5.9-1ubuntu4.21
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
X-XSS-Protection: 0
Vary: Accept-Encoding
Content-Encoding: gzip
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Content-Length: 1567
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Content-Type: text/html
--c5eda821-E--
--c5eda821-H--
Message: Warning. String match "0" at RESPONSE_HEADERS:X-XSS-Protection. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "141"] [id "981403"] [msg "AppDefect: IE8's XSS protection Filter is Disabled."] [data "X-XSS-Protection: 0"] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#internet-explorer-xss-filter-disabled"]
Apache-Handler: application/x-httpd-php
Stopwatch: 1501247328871413 10305 (- - -)
Stopwatch2: 1501247328871413 10305; combined=2942, p1=395, p2=2280, p3=34, p4=41, p5=147, sr=108, sw=45, l=0, gc=0
Response-Body-Transformed: Dechunked
Producer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/); OWASP_CRS/2.2.8.
Server: Apache
WebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" ""
Engine-Mode: "DETECTION_ONLY"
--c5eda821-Z--
Posted on 2017-07-29 02:06:55
This works for me:
>>> haystack = """--c5eda821-H-
Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]
Message: Warning. Pattern match "(.*?)=(?i)(?!.*httponly.*)(.*$)" at RESPONSE_HEADERS:Set-Cookie. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "83"] [id "981184"] [msg "AppDefect: Missing HttpOnly Cookie Flag for auth."] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#cookie-not-setting-httponly-flag"]
Apache-Handler: application/x-httpd-php
Stopwatch: 1501247328778702 7722 (- - -)
Stopwatch2: 1501247328778702 7722; combined=2901, p1=886, p2=1609, p3=54, p4=87, p5=213, sr=309, sw=52, l=0, gc=0
Response-Body-Transformed: Dechunked
Producer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/);
OWASP_CRS/2.2.8.
Server: Apache
WebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" ""
Engine-Mode: "DETECTION_ONLY"
--c5eda821-Z--"""
>>> import re
>>> print(re.search(r'--[\da-f]{8}-\w-(.+?)--[\da-f]{8}-\w--$', haystack, re.M|re.DOTALL).group(1))
Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]
Message: Warning. Pattern match "(.*?)=(?i)(?!.*httponly.*)(.*$)" at RESPONSE_HEADERS:Set-Cookie. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "83"] [id "981184"] [msg "AppDefect: Missing HttpOnly Cookie Flag for auth."] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#cookie-not-setting-httponly-flag"]
Apache-Handler: application/x-httpd-php
Stopwatch: 1501247328778702 7722 (- - -)
Stopwatch2: 1501247328778702 7722; combined=2901, p1=886, p2=1609, p3=54, p4=87, p5=213, sr=309, sw=52, l=0, gc=0
Response-Body-Transformed: Dechunked
Producer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/);
OWASP_CRS/2.2.8.
Server: Apache
WebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" ""
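Engine-Mode: "DETECTION_ONLY"
The error message you describe occurs because re.search returns None when there is no match, and None has no groups attribute.
To prevent this exception, test the return value of the method to check whether there was a match: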
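regex = re.compile(r'--[\da-f]{8}-\w-(.+?)--[\da-f]{8}-\w--$', re.M|re.DOTALL)
match = regex.search(haystack)
# search() returns None when nothing matches, so check before calling group()
if match:
    print(match.group(1))
else:
    print("No match")
Update
You are right for the case of a single long string, but I have more of them, in the same format. However, I only want the content between those tags. - dumbo
Try this: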
>>> regex = re.compile(r'--[\da-f]{8}-\w--(.+?)--[\da-f]{8}-\w--', re.M|re.DOTALL)
>>> for i, match in enumerate(regex.findall(haystack)):
...     print('{:02d}-> {}...'.format(i, match[:15].strip()))
...
00-> HTTP/1.1 200 O...
01-> Message: Warni...
The findall method returns a list of matches. If you only want the last match:
>>> matches = regex.findall(haystack)
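>>> print(matches[-1])
Or just the second one:
>>> print(matches[1])
Posted on 2017-07-29 02:10:15
You can use two re.sub calls to remove the unwanted parts. If that works in your case, the regular expressions are usually simpler.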
>>> import re
>>> text = open('temp.txt').read()
>>> re.sub(r'--c5eda821-Z--', '', re.sub(r'--c5eda821-H-', '', text))
'\n\nMessage: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]\nMessage: Warning. Pattern match "(.*?)=(?i)(?!.*httponly.*)(.*$)" at RESPONSE_HEADERS:Set-Cookie. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "83"] [id "981184"] [msg "AppDefect: Missing HttpOnly Cookie Flag for auth."] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#cookie-not-setting-httponly-flag"]\nApache-Handler: application/x-httpd-php\nStopwatch: 1501247328778702 7722 (- - -)\nStopwatch2: 1501247328778702 7722; combined=2901, p1=886, p2=1609, p3=54, p4=87, p5=213, sr=309, sw=52, l=0, gc=0\nResponse-Body-Transformed: Dechunked\nProducer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/); OWASP_CRS/2.2.8.\nServer: Apache\nWebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" ""\nEngine-Mode: "DETECTION_ONLY"\n\n\n'
Edit in response to comment: In that case, I would suggest an approach like this.
>>> import re
>>> with open('temp.txt') as text:
...     for line in text.readlines():
...         if re.match(r'--c5[a-z]{3}821-[A-Z]-', line.strip()):
...             continue
...         else:
...             print(line.strip())
...
https://stackoverflow.com/questions/45379491
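As a follow-up to the line-by-line idea above, here is a minimal sketch that groups the log lines by section letter instead of just dropping the boundary lines. It assumes every boundary line has the form --<8 hex digits>-<letter>-- as in the sample log from the question. The function name split_sections and the shortened sample string are made up for illustration and are not part of any answer above.

```python
import re

# Assumed boundary format, taken from the sample log: --c5eda821-H--
BOUNDARY = re.compile(r'^--([0-9a-f]{8})-([A-Z])--$')


def split_sections(log_text):
    """Map (transaction_id, section_letter) to the text of that section."""
    sections = {}
    current = None
    for line in log_text.splitlines():
        m = BOUNDARY.match(line.strip())
        if m:
            # A boundary line opens a new (possibly empty) section.
            current = (m.group(1), m.group(2))
            sections[current] = []
        elif current is not None:
            # Any other line belongs to the most recently opened section.
            sections[current].append(line)
    return {key: '\n'.join(body).strip() for key, body in sections.items()}


# Shortened, hypothetical sample in the same shape as the log in the question.
sample = """--c5eda821-F--
HTTP/1.1 200 OK
Content-Type: text/html
--c5eda821-E--
--c5eda821-H--
Message: Warning. String match "0" at RESPONSE_HEADERS:X-XSS-Protection.
Engine-Mode: "DETECTION_ONLY"
--c5eda821-Z--"""

sections = split_sections(sample)
print(sections[('c5eda821', 'H')])
```

Looking sections up by letter sidesteps a limitation of the single-regex approach: a pattern that matches both an opening and a closing boundary consumes the closing one, so the section that starts at that boundary is never returned by findall.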