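I have the following conditions:
My desired output is as follows:
(a) If a number follows "signing bonus", keep that part of the string and remove everything else. See expected outputs 1 and 2.
(b) If no number follows "signing bonus", I should get the first part of the string. See expected output 3.
Expected output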
My regex:
match1 = re.findall(r'(?<=\bSigning Bonus\b)\s*(?:\S+\b\s*){0,8}',value, re.I|re.M|re.DOTALL)
This handles outputs 1 and 2, but not output 3.
I am also open to solutions that do not need regex!
Posted on 2019-04-25 11:27:49
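If you can use re.sub, you can use this regex to replace the matched text with an empty string:
^[^\d\n]*signing bonus\s*|\s*signing bonus[^\d\n]*$
In the first two cases you want to capture the string after "signing bonus", but in the third case the expected string comes before "signing bonus", so a second pattern is needed, joined to the first with alternation.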
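To make the two alternatives easier to read, here is a minimal sketch of the same pattern written in verbose form. The function name strip_signing_bonus is hypothetical, and the re.IGNORECASE flag is an assumption added because the question used re.I; the pattern in this answer is case-sensitive as written.

```python
import re

# Verbose form of the pattern above. In re.VERBOSE mode unescaped
# whitespace is ignored, so the space in the phrase is written as [ ].
PATTERN = re.compile(
    r"""
      ^[^\d\n]*signing[ ]bonus\s*   # digit-free prefix ending at the last such occurrence of the phrase
    |
      \s*signing[ ]bonus[^\d\n]*$   # the phrase followed by a digit-free tail up to the end
    """,
    re.VERBOSE | re.IGNORECASE,  # IGNORECASE is an assumption, not part of the original pattern
)

def strip_signing_bonus(text):
    """Remove whichever side of the phrase contains no digits (hypothetical helper)."""
    return PATTERN.sub("", text)
```

The first alternative covers the case where the number comes after the phrase, and the second covers the case where it comes before it.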
Python code:
import re

arr = ['Your signing bonus is 123,000','This year signing bonus is bad. the signing bonus for this year is EUR 123,000','The bonus is 14,456, but signing bonus.']
for s in arr:
    # Replace whichever alternative matches with an empty string.
    print(s, '-->', re.sub(r'^[^\d\n]*signing bonus\s*|\s*signing bonus[^\d\n]*$', '', s))
Prints:
Your signing bonus is 123,000 --> is 123,000
This year signing bonus is bad. the signing bonus for this year is EUR 123,000 --> for this year is EUR 123,000
The bonus is 14,456, but signing bonus. --> The bonus is 14,456, but
Posted on 2019-04-25 11:29:29
Try the code below.
import re

s1 = "Your signing bonus is 123,000"
s2 = "This year signing bonus is bad. the signing bonus for this year is EUR 123,000"
s3 = "The bonus is 14,456, but signing bonus."
regex = '[0-9]'

def format_string(s):
    # Split on the phrase and print every piece that contains a digit.
    for subs in s.split('signing bonus'):
        if re.search(regex, subs):
            print(subs.strip())

format_string(s1)
format_string(s2)
format_string(s3)
Output:
is 123,000
for this year is EUR 123,000
The bonus is 14,456, but
Posted on 2019-04-25 11:31:05
This will print your answer:
statements = [
    'Your signing bonus is 123,000',
    'This year signing bonus is bad. the signing bonus for this year is EUR 123,000',
    'The bonus is 14,456, but signing bonus.',
]
for statement in statements:
    ans = statement.split('signing bonus')
    if not ans:
        print('')
        continue
    # Walk the pieces from last to first; print a piece as soon as one of
    # its words parses as an integer once commas are removed.
    for i in range(len(ans) - 1, -1, -1):
        for word in ans[i].split(' '):
            try:
                number = int(word.replace(',', ''))
                print(ans[i].strip())
                break
            except ValueError:
                pass
Output:
is 123,000
for this year is EUR 123,000
The bonus is 14,456, but
https://stackoverflow.com/questions/55847844