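I am scraping Wikipedia and trying to use a regex to exclude disambiguation pages. I have looked around for tips on using negative lookahead, but I can't get it to work. I think I am missing something fundamental about how it is used, and so far I am completely lost. Can anyone point me in the right direction? (I don't want to simply test for 'disambiguation' in y; I am trying to understand how negative lookahead works.) Thanks. Here is the code: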
import re

list_links = ['/wiki/Oolong_(disambiguation)', '/wiki/File:Mi_Lan_Xiang_Oolong_Tea_cropped.jpg',
              '/wiki/Taiwanese_tea', '/wiki/Tung-ting_tea',
              '/wiki/Nantou_County', '/wiki/Taiwan', '/wiki/Dongfang_Meiren',
              '/wiki/Alishan_National_Scenic_Area', '/wiki/Chiayi_County',
              '/wiki/Dayuling', '/wiki/Baozhong_tea', '/wiki/Pinglin_Township']

def findString(string):
    regex1 = r'(/wiki/)(_\($)(!?disambiguation)'
    for x in list_links:
        y = re.findall(regex1, x)
        print(y)

findString(list_links)

Posted on 2022-03-16 12:36:14
Depending on your needs, you can use either of these regular expressions. I also changed the function definition a little to follow PEP 8.
import re

def remove_disambiguation_link(list_of_links):
    # Raw strings keep "\(" as a regex escape for a literal parenthesis.
    # Note: "!?" matches an optional literal "!"; it is not a lookahead.
    regex = r"(.*)\((!?disambiguation)\)"
    # regex = r"(/wiki/)(.*)\((!?disambiguation)\)"
    # return [links for links in list_of_links if not re.search(regex, links)]
    return list(filter(lambda link: not re.search(regex, link), list_of_links))

list_links = remove_disambiguation_link(list_links)
print(list_links)

Output:
[
"/wiki/File:Mi_Lan_Xiang_Oolong_Tea_cropped.jpg",
"/wiki/Taiwanese_tea",
"/wiki/Tung-ting_tea",
"/wiki/Nantou_County",
"/wiki/Taiwan",
"/wiki/Dongfang_Meiren",
"/wiki/Alishan_National_Scenic_Area",
"/wiki/Chiayi_County",
"/wiki/Dayuling",
"/wiki/Baozhong_tea",
"/wiki/Pinglin_Township",
]

Posted on 2022-03-16 12:25:28
In your case, the simplest solution is not to use a regular expression at all. Just do something like this:
import re

list_links = ['/wiki/Oolong_(disambiguation)', '/wiki/File:Mi_Lan_Xiang_Oolong_Tea_cropped.jpg',
              '/wiki/Taiwanese_tea', '/wiki/Tung-ting_tea',
              '/wiki/Nantou_County', '/wiki/Taiwan', '/wiki/Dongfang_Meiren',
              '/wiki/Alishan_National_Scenic_Area', '/wiki/Chiayi_County',
              '/wiki/Dayuling', '/wiki/Baozhong_tea', '/wiki/Pinglin_Township']

def findString(string):
    regex1 = r'(/wiki/)(_\($)'
    for x in string:
        if 'disambiguation' in x:
            continue  # skip disambiguation links
        y = re.findall(regex1, x)
        print(y)

findString(list_links)

Posted on 2022-03-16 12:27:22
You don't need a regex. You can iterate over list_links and check whether the string you are looking for, 'disambiguation', is in each item of list_links.
list_links = ['/wiki/Oolong_(disambiguation)', '/wiki/File:Mi_Lan_Xiang_Oolong_Tea_cropped.jpg',
'/wiki/Taiwanese_tea', '/wiki/Tung-ting_tea',
'/wiki/Nantou_County', '/wiki/Taiwan', '/wiki/Dongfang_Meiren',
'/wiki/Alishan_National_Scenic_Area', '/wiki/Chiayi_County',
'/wiki/Dayuling', '/wiki/Baozhong_tea', '/wiki/Pinglin_Township']
to_find = 'disambiguation'

def findString(list_links):
    # iterate over a copy, because the list is modified inside the loop
    for link in list(list_links):
        if to_find in link:
            # get index of match
            match_index = list_links.index(link)
            # remove match from list
            list_links.pop(match_index)
    # print new list without 'disambiguation' items
    print(list_links)

findString(list_links)

https://stackoverflow.com/questions/71496858
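
Since the question asks specifically about negative lookahead, here is a minimal sketch of one way it could be written. It is not taken from any of the answers above; the names `NOT_DISAMBIGUATION` and `keep_regular_links` are made up for illustration, and the sample list is a shortened copy of the one in the question. Two things in the original pattern stand in the way: a negative lookahead is written `(?!...)`, whereas `(!?disambiguation)` is an ordinary group that matches an optional literal "!" followed by "disambiguation"; and `_\($` only matches if the string ends immediately after "_(".

```python
import re

# Shortened copy of the list from the question.
links = [
    "/wiki/Oolong_(disambiguation)",
    "/wiki/File:Mi_Lan_Xiang_Oolong_Tea_cropped.jpg",
    "/wiki/Taiwanese_tea",
    "/wiki/Taiwan",
]

# (?!...) is a zero-width negative lookahead: it succeeds only when the
# pattern inside it does NOT match starting at the current position.
# Placed right after "/wiki/", it scans the rest of the string for the
# literal text "(disambiguation)" and rejects the link if it is found.
NOT_DISAMBIGUATION = re.compile(r"/wiki/(?!.*\(disambiguation\)).+")


def keep_regular_links(candidates):
    """Return only the links that are not disambiguation pages."""
    return [link for link in candidates if NOT_DISAMBIGUATION.fullmatch(link)]


print(keep_regular_links(links))
```

The lookahead consumes no characters, so the trailing `.+` still matches the whole title for links that pass the check. Using `fullmatch` pins the pattern to the start of the string, so the regex engine cannot retry from a later position.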