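I'm having trouble understanding the output of this regular expression. I'm using the following regex to find dates in text:
^(?:(1[0-2]|0?[1-9])-(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])-(1[0-2]|0?[1-9]))-(?:[0-9]{2})?[0-9]{2}$
It seems to match the pattern in the text correctly, but I'm confused by the return value.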
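For this test string:
TestString = "10-20-2015"
it returns this:
[('10', '20', '', '')]
If I put () around the whole regex, I get this instead:
[('10-20-2015', '10', '20', '', '')]
I expected it to return just the full date string, but it seems to be breaking the result up, and I don't understand why. Wrapping the regex in () does return the full date string, but it also returns 4 extra values.
How can I make it match only the full date string, and not the smaller parts of it?
From my console: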
Python 3.4.2 (default, Oct 8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> pattern = "^(?:(1[0-2]|0?[1-9])-(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])-(1[0-2]|0?[1-9]))-(?:[0-9]{2})?[0-9]{2}$"
>>> TestString = "10-20-2015"
>>> re.findall(pattern, TestString, re.I)
[('10', '20', '', '')]
>>> pattern = "(^(?:(1[0-2]|0?[1-9])-(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])-(1[0-2]|0?[1-9]))-(?:[0-9]{2})?[0-9]{2}$)"
>>> re.findall(pattern, TestString, re.I)
[('10-20-2015', '10', '20', '', '')]
>>>
>>> TestString = "10--2015"
>>> re.findall(pattern, TestString, re.I)
[]
>>> pattern = "^(?:(1[0-2]|0?[1-9])-(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])-(1[0-2]|0?[1-9]))-(?:[0-9]{2})?[0-9]{2}$"
>>> re.findall(pattern, TestString, re.I)
[]
Based on the answers, the pattern I ended up with is:
((?:(?:1[0-2]|0?[1-9])-(?:3[01]|[12][0-9]|0?[1-9])|(?:3[01]|[12][0-9]|0?[1-9])-(?:1[0-2]|0?[1-9]))-(?:[0-9]{2})?[0-9]{2})
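Posted on 2016-01-27 04:49:15
Each () is a capturing group: (1[0-2]|0?[1-9]) captures 10, (3[01]|[12][0-9]|0?[1-9]) captures 20, and so on. When a pattern contains groups, re.findall returns the groups rather than the whole match, and the two empty strings come from the groups in the second alternative, which did not take part in the match. When you wrap everything in (), that outer group opens before the other () groups, so it becomes the first group and matches everything. To group without capturing, use a non-capturing group, written (?:...) instead of (...).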
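To make the difference concrete, here is a small sketch that runs the question's pattern next to a version where every (...) has been turned into (?:...). The variable names are only for illustration, and the non-capturing pattern is one possible rewrite, not necessarily the only one.

```python
import re

# Pattern from the question: four capturing groups inside the alternation.
capturing = (
    r"^(?:(1[0-2]|0?[1-9])-(3[01]|[12][0-9]|0?[1-9])"
    r"|(3[01]|[12][0-9]|0?[1-9])-(1[0-2]|0?[1-9]))"
    r"-(?:[0-9]{2})?[0-9]{2}$"
)

# Same pattern with every (...) changed to (?:...), so nothing is captured.
non_capturing = (
    r"^(?:(?:1[0-2]|0?[1-9])-(?:3[01]|[12][0-9]|0?[1-9])"
    r"|(?:3[01]|[12][0-9]|0?[1-9])-(?:1[0-2]|0?[1-9]))"
    r"-(?:[0-9]{2})?[0-9]{2}$"
)

test_string = "10-20-2015"

# With groups, findall returns one tuple of group contents per match.
with_groups = re.findall(capturing, test_string)
# Without groups, findall returns the full matched text.
without_groups = re.findall(non_capturing, test_string)

print(with_groups)     # [('10', '20', '', '')]
print(without_groups)  # ['10-20-2015']
```

With no capturing groups left, there is no need for the extra outer () either.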
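Posted on 2021-11-08 14:35:55
We can do this with re.search(), one of the most important functions in the re module. It scans through a string looking for a location where the pattern matches, and returns a match object for the first one it finds.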
import re
text = "10-20-2015"
# Raw string, so that the backslashes reach the regex engine unchanged.
date_regex = r'(\d{1,2})-(\d{1,2})-(\d{4})'
# \d in the pattern above stands for a digit character such as [0-9].
# The numbers in curly brackets {} give the number of digits permitted.
# Parentheses are used for capturing groups, so that we can treat
# multiple characters as a single unit.
search_date = re.search(date_regex, text)
# for entire match
print(search_date.group())
# also print(search_date.group(0)) can be used
# for the first parenthesized subgroup
print(search_date.group(1))
# for the second parenthesized subgroup
print(search_date.group(2))
# for the third parenthesized subgroup
print(search_date.group(3))
# for a tuple of all matched subgroups
print(search_date.group(1, 2, 3))
Output of each of the print statements above:
10-20-2015
10
20
2015
('10', '20', '2015')
Hope this answer clears up your doubts :)
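The same idea also works with the stricter pattern from the question without touching its groups: instead of re.findall, iterate over match objects and take group(0). This is only a sketch, and full_dates is a hypothetical helper name introduced for the example.

```python
import re

# The question's pattern, left unchanged (capturing groups and all).
pattern = re.compile(
    r"^(?:(1[0-2]|0?[1-9])-(3[01]|[12][0-9]|0?[1-9])"
    r"|(3[01]|[12][0-9]|0?[1-9])-(1[0-2]|0?[1-9]))"
    r"-(?:[0-9]{2})?[0-9]{2}$"
)

def full_dates(text):
    """Return the full text of every match, ignoring the capturing groups."""
    # group(0) is always the entire match, however many groups the pattern has.
    return [m.group(0) for m in pattern.finditer(text)]

print(full_dates("10-20-2015"))  # ['10-20-2015']
print(full_dates("10--2015"))    # []
```

Because the pattern is anchored with ^ and $, this only finds a date that makes up the whole string; searching inside longer text would need the anchors removed or adjusted.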
https://stackoverflow.com/questions/35028969