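I currently have a function that returns a term and the sentence it occurs in. At the moment, the function only retrieves the first matching term from the list of terms. I would like to retrieve all matches, not just the first one.
For example, given list_of_matches = ["heart attack", "cardiovascular", "hypoxia"] and the sentences text_list = ["A heart attack is a result of cardiovascular...", "Chronic intermittent hypoxia is the..."]
The ideal output is: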
['heart attack', 'a heart attack is a result of cardiovascular...'],
['cardiovascular', 'a heart attack is a result of cardiovascular...'],
['hypoxia', 'chronic intermittent hypoxia is the...']

# this is the current function
def find_word(list_of_matches, line):
    for words in list_of_matches:
        if any([words in line]):
            return words, line

# returns list of 'term, matched string'
key_vals = [list(find_word(list_of_matches, line.lower())) for line in text_list if
            find_word(list_of_matches, line.lower()) != None]
# output is currently
['heart attack', 'a heart attack is a result of cardiovascular...'],
['hypoxia', 'chronic intermittent hypoxia is the...']

Posted on 2021-12-13 16:31:32
You'll want to use regex here.
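import re

def find_all_matches(words_to_search, text):
    matches = []
    for word in words_to_search:
        # re.search returns None when the word is absent, so check before calling .group()
        match = re.search(re.escape(word), text)
        if match:
            matches.append(match.group())
    return [matches, text]

Note that this returns a nested list containing all the matches.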
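To get the output format the question asks for, with one [term, sentence] pair per match, a minimal sketch could look like the following. The function name find_all_pairs is hypothetical, and the \b word boundaries are an assumption about intent (they stop "hypoxia" from matching inside a longer word); drop them if substring matches are wanted. Note that \b may not behave as expected for terms that begin or end with punctuation.

```python
import re

def find_all_pairs(list_of_matches, text_list):
    """Return one [term, sentence] pair for every term found in every sentence."""
    pairs = []
    for line in text_list:
        lowered = line.lower()
        for term in list_of_matches:
            # re.escape keeps characters such as "." or "+" in a term literal;
            # \b limits matches to whole words (an assumption, see above).
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            if re.search(pattern, lowered):
                pairs.append([term, lowered])
    return pairs

list_of_matches = ["heart attack", "cardiovascular", "hypoxia"]
text_list = ["A heart attack is a result of cardiovascular...",
             "Chronic intermittent hypoxia is the..."]

key_vals = find_all_pairs(list_of_matches, text_list)
for pair in key_vals:
    print(pair)
```

Unlike the original find_word, this does not return as soon as the first term matches, so a sentence that contains several terms contributes several pairs.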
https://stackoverflow.com/questions/70337701