
Trying to check string similarity between 2 columns with Python pysimilar or DamerauLevenshtein

Stack Overflow user
Asked 2021-11-16 21:22:15

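I'm trying to summarize text with summarizers. I want to check whether these texts are too similar, and from searching on Google it looks like I can use packages such as pysimilar or fastDamerauLevenshtein. The problem is that they only seem to work with a single text. Do you know how to do this for, say, 4 texts or more?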

```python
import pandas as pd
from pysimilar import compare
from summarizers import Summarizers

summ = Summarizers()
data = ["The NN-CS89L offers next-level cooking convenience. Its four distinct cooking methods - steaming, baking, grilling and microwaving ensure your meals are cooked or reheated to perfection. Its multi-function capabilities can be combined to save time without compromising taste, texture or nutritional value. It’s the all-in-one kitchen companion designed for people with a busy lifestyle.", "These slim and stylish bodies are packed with high performance. The attractive compact designs and energy-saving functions help Panasonic Blu-ray products consume as little power as possible. You can experience great movie quality with this ultra-fast booting DMP-BD89 Full HD Blu-ray disc player. After starting the player, the time it takes from launching the menu to playing a disc is much shorter than in conventional models. The BD89 also allows for smart home networking (DLNA) and provides access to video on demand, so that home entertainment is more intuitive, more comfortable, and lots more fun."]

df = pd.DataFrame(data, columns=['summaries'])
df['abstracts'] = df['summaries'].apply(summ)

# Passing whole columns (pandas Series) raises the TypeError shown below
compare(df.summaries, df.abstracts)
```




I get this error:
TypeError                                 Traceback (most recent call last)
<ipython-input-14-d1d78dc1f358> in <module>
----> 1 compare(df.summaries, df.abstracts)

~\Anaconda3\lib\site-packages\pysimilar\__init__.py in compare(self, string_i, string_j, isfile)
     89 
     90         if not isinstance(string_i, (str, Path)) or not isinstance(string_j, (str, Path)):
---> 91             raise TypeError(
     92                 'Both string i and string j must be of type either string or Path')
     93 

TypeError: Both string i and string j must be of type either string or Path

Thanks in advance!

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-11-16 21:39:07

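`compare` expects two strings (or paths), not whole columns. You need to create a function that takes a row containing the values of both columns and calls `compare` on those two values, then apply that function to the dataframe.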

```python
def compare_row_wise(row):
    return compare(row['summaries'], row['abstracts'])

df.apply(compare_row_wise, axis=1)
```
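
If you want to try the idea without installing pysimilar or summarizers, here is a minimal sketch using only the standard library. It assumes `difflib.SequenceMatcher` as a stand-in scorer, so the numbers will not match what pysimilar's `compare` returns. The helper names (`similarity`, `compare_columns`, `compare_all_pairs`) and the short sample strings are made up for illustration. It shows the same principle as above (one scalar call per pair of strings), plus one possible way to cover the "4 texts or more" case by scoring every pair.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in scorer (NOT pysimilar): similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def compare_columns(left, right):
    """Score each (left[i], right[i]) pair with one scalar call per row."""
    return [similarity(a, b) for a, b in zip(left, right)]


def compare_all_pairs(texts):
    """Score every unordered pair of texts, keyed by their index pair."""
    return {
        (i, j): similarity(texts[i], texts[j])
        for i, j in combinations(range(len(texts)), 2)
    }


# Hypothetical sample data standing in for the two dataframe columns
summaries = ["steam, bake, grill and microwave", "a slim Blu-ray disc player"]
abstracts = ["steam, bake, grill and microwave", "a compact Blu-ray player"]

row_scores = compare_columns(summaries, abstracts)
pair_scores = compare_all_pairs(summaries + abstracts)

print(row_scores)
print(pair_scores)
```

With a dataframe, the same scalar function can be used row by row, for example `df.apply(lambda row: similarity(row['summaries'], row['abstracts']), axis=1)`.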
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/69996181
