I have a text file containing only 35 strings, and I want to find the most relevant string in the file. How can I do this? Can I implement BM25F, VSM, or POS?
For example:
Panoramio Bahawalpur
... - Bahawalpur - Picture of Bahawalpur, Punjab Province - TripAdvisor
... Minister Syed Yousaf Raza Gillani's short visit to
Bahawalpur
Bahawalpur Station Pictures - Pakistan in Photos
Noor Mahal Station , Bahawalpur Railway Station | Noor Mahal the italian style palac ...
Bahawalpur Railway Pakistan
Nur Mehal, Bahawalpur
The given input is "Bahawalpur Railway Station".
How can I find the most suitable/relevant string?
Posted on 2017-06-17 09:06:27
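This is a very simple task. SequenceMatcher from difflib returns a similarity ratio between two strings, which you can scale to a percentage.

from difflib import SequenceMatcher

def similar(a, b):
    # ratio() returns a float between 0 and 1
    return SequenceMatcher(None, a, b).ratio()

text = "This is hello-hi image"
print("The score of relevancy is:", similar("Hello", text) * 100)

You can adapt the result as needed. Thanks.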
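To apply this to the strings from the question, one possible sketch is to score every candidate against the input and keep the highest-scoring one. The helper name most_relevant, the lowercasing step, and the shortened candidate list are assumptions made for illustration, not part of the original answer.

```python
from difflib import SequenceMatcher

def similar(a, b):
    # Lowercase both strings so the comparison ignores case (an assumption).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def most_relevant(query, candidates):
    # Hypothetical helper: return the candidate most similar to the query.
    return max(candidates, key=lambda c: similar(query, c))

# A few of the strings from the question; in practice, read all 35 from the file.
candidates = [
    "Panoramio Bahawalpur",
    "Bahawalpur Station Pictures - Pakistan in Photos",
    "Bahawalpur Railway Pakistan",
    "Nur Mehal, Bahawalpur",
]
query = "Bahawalpur Railway Station"
best = most_relevant(query, candidates)
print(best, similar(query, best) * 100)
```

Note that this compares character sequences, so it is not a term-weighting model like BM25F or VSM. It tends to work for short strings like these, but a word-based ranking may suit longer documents better.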
https://stackoverflow.com/questions/43706492