This is a follow-up to my previous question, in which I am trying to replace strings in one list with strings from another list.

    import numpy as np
    from difflib import SequenceMatcher
    from pprint import pprint

    def similar(a, to_match):
        percent_similarity = [SequenceMatcher(None, a, b).ratio() for b in to_match]
        max_value_index = [i for i, j in enumerate(percent_similarity) if j == max(percent_similarity)][0]
        map = [to_match[max_value_index] if max(percent_similarity) > 0.9 else a][0]
        return map

    if __name__ == '__main__':
        strlist = ['D-saturn 6-pluto', np.nan, 'D-astroid 3-cyclone', 'DL-astroid 3-cyclone', 'DL-astroid', 'D-comment', 'literal']
        to_match = ['saturn 6-pluto', 'pluto', 'astroid 3-cyclone', 'D-comment', 'D-astroid']
        for item in strlist:
            map = [similar(item, to_match) for item in strlist]
        pprint(map)

Expected output:

    ['saturn 6-pluto', np.nan, 'astroid 3-cyclone', 'astroid 3-cyclone', 'D-astroid', 'D-comment', 'literal']

The code works if there is no np.nan in strlist. I want to check whether an item is nan and, if it is, return nan. However, I don't know how to use elif in the list comprehension statement map = [to_match[max_value_index] if max(percent_similarity) > 0.9 else a][0].

Can anyone help me?
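
Answered 2019-08-01 10:42:25

You can write the if/else in the other list comprehension, the one that builds map, so that non-string items are passed through unchanged.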

    map = [similar(item, to_match) if isinstance(item, str) else item for item in strlist]

Answered 2019-08-01 10:30:10

Edit:

How about changing the similar function so that it returns the item itself when its type is not a string?

    def similar(a, to_match):
        if type(a) is not str:
            return a
        percent_similarity = [SequenceMatcher(None, a, b).ratio() for b in to_match]
        max_value_index = [i for i, j in enumerate(percent_similarity) if j == max(percent_similarity)][0]
        ret = [to_match[max_value_index] if max(percent_similarity) > 0.9 else a][0]
        return ret

You can also filter strlist before processing it in the for-loop with:

    strlist = [s for s in strlist if type(s) is str]

Note that this removes np.nan from the list instead of keeping it in place.

https://stackoverflow.com/questions/57306830
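
For reference, here is a minimal end-to-end sketch of the "return the item itself" approach, run on the data from the question. It is one possible arrangement, not the only one. The threshold parameter and the names nan, scores, best_index and result are introduced here for illustration only, and float("nan") stands in for np.nan (which is itself a plain Python float) so that the snippet needs nothing beyond the standard library.

```python
from difflib import SequenceMatcher


def similar(a, to_match, threshold=0.9):
    # Pass non-string items (e.g. NaN, which is a float) through unchanged.
    if not isinstance(a, str):
        return a
    # Score every candidate once; on ties the first candidate wins,
    # which mirrors taking index [0] in the original code.
    scores = [SequenceMatcher(None, a, b).ratio() for b in to_match]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    # Replace only when the best candidate is similar enough.
    return to_match[best_index] if scores[best_index] > threshold else a


nan = float("nan")  # assumption: stands in for np.nan from the question
strlist = ['D-saturn 6-pluto', nan, 'D-astroid 3-cyclone',
           'DL-astroid 3-cyclone', 'DL-astroid', 'D-comment', 'literal']
to_match = ['saturn 6-pluto', 'pluto', 'astroid 3-cyclone',
            'D-comment', 'D-astroid']

result = [similar(item, to_match) for item in strlist]
print(result)
```

Putting the type check inside similar keeps the list comprehension simple, and computing the scores once avoids calling max(percent_similarity) repeatedly inside another comprehension.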