I have a string:
string= """"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,3)","name":"Finance","$type":"voyager.identity.profile.Skill"},{"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,22)","name":"Financial ["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,34)","name":"Due Diligence","name":"Strategy""""
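What regex can I use to retrieve the values after "name": so that I get Due Diligence, Finance and Financial?
I tried:
match = re.compile(r'"name"\:(.\w+)')
match.findall(string)
but it returns
['"Finance', '"Financial', '"Due', '"Financial', '"Strategy']
Due Diligence is split apart; I want both words returned as a single value.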
Posted on 2017-12-29 20:52:16
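The regex does not match your whitespace because \w only matches word characters (letters, digits and underscore), not spaces.
"name"\:(.\w+\s*\w*) accounts for a possible space followed by one extra word (it won't work with three words, but it will in your case).
"name"\:(.\w+\s*\w*"?) also captures the closing " at the end of each value, but it does not pick up a closing quote for Financial.
Edit: fixed the second version for "Financial".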
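The behaviour described above can be checked with a minimal sketch. The `sample` string and the three-word value are shortened, made-up stand-ins (assumptions for illustration, not the asker's full data); the patterns are the one from the question and the two suggested in this answer.

```python
import re

# Shortened, hypothetical sample; the asker's real input is much longer.
sample = '"name":"Finance","name":"Due Diligence","name":"Strategy"'

# Pattern from the question: \w+ stops at the space in "Due Diligence".
original = re.findall(r'"name"\:(.\w+)', sample)
print(original)    # ['"Finance', '"Due', '"Strategy']

# First suggestion: optional whitespace plus an optional second word.
two_words = re.findall(r'"name"\:(.\w+\s*\w*)', sample)
print(two_words)   # ['"Finance', '"Due Diligence', '"Strategy']

# Second suggestion: additionally capture an optional closing quote.
with_quote = re.findall(r'"name"\:(.\w+\s*\w*"?)', sample)
print(with_quote)  # ['"Finance"', '"Due Diligence"', '"Strategy"']

# Limitation: a three-word value (made up here) is cut after two words.
three = re.findall(r'"name"\:(.\w+\s*\w*)', '"name":"Mergers and Acquisitions"')
print(three)       # ['"Mergers and']
```

The last call illustrates the caveat mentioned in the answer: the pattern only allows for one extra word, so longer names would need a different approach.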
Posted on 2017-12-29 20:56:57
I would use a non-greedy .*? expression together with a trailing quote:
import re
string = """$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,3)","name":"Finance","$type":"voyager.identity.profile.Skill"},{"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,22)","name":"Financial ["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,34)","name":"Due Diligence","name":"Strategy"""
# With the leading double quote
match = re.compile(r'"name"\:(".*?)["\[]')
a = match.findall(string)
print(a)
# Stripping out the leading double quote
match = re.compile(r'"name"\:"(.*?)["\[]')
b = match.findall(string)
print(b)
The final output is:
['"Finance', '"Financial ', '"Due Diligence']
['Finance', 'Financial ', 'Due Diligence']
https://stackoverflow.com/questions/48028258