Q: How to code word search results to count word occurrences

Stack Overflow user

Asked 2019-10-24 10:21:55

Answers: 4 | Views: 112 | Followers: 0 | Votes: 2

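I am trying to count how many times words occur in a list. The result I need is (word, # of occurrences), but I keep getting (word, 1) (word, 2) (word, 3) when it should give me (word, 3).

The variables library and dictionary are defined elsewhere; each document in library is a list of words.

I believe my code is 99% correct, but the output is not what I want.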

def word_search():
    results = []
    search_word = dictionary[0]
    for search_word in dictionary:
        count = 0
        # Scan every word of every document for the current search word
        for document in library:
            for word in document:
                if search_word == word:
                    count = count + 1
                    results.append((word, count))
    return results

Tags: python, list, tuples

4 Answers

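Stack Overflow user

Answered 2019-10-24 10:27:17

This happens because results is a list of tuples, and you append a new tuple to it every time you find another occurrence of the word. return (results[-1]) should work, but there is a simpler way to write this function that does not use a list. Since you are still learning, I'll let you figure that out yourself :)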
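
To make the behavior described above concrete, here is a minimal reproduction of the question's loop. The sample library and dictionary values are made up for illustration; the question does not show its real data.

```python
# Hypothetical sample data; the real library and dictionary are not shown in the question.
library = [["apple", "banana", "apple"], ["apple", "cherry"]]
dictionary = ["apple"]

def word_search():
    results = []
    for search_word in dictionary:
        count = 0
        for document in library:
            for word in document:
                if search_word == word:
                    count = count + 1
                    results.append((word, count))  # runs once per match
    return results

running_totals = word_search()
print(running_totals)      # [('apple', 1), ('apple', 2), ('apple', 3)]
print(running_totals[-1])  # ('apple', 3)
```

Note that results[-1] only holds the final total of the last word matched. With several search words in dictionary, the totals of the earlier words would still have to be picked out of the list.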

Votes: 0

Stack Overflow user

Answered 2019-10-24 10:35:09

Maybe you need to fix the indentation so that the append happens after the inner loops:

def word_search():
    results = []
    for search_word in dictionary:
        count = 0
        for document in library:
            for word in document:
                if search_word == word:
                    count = count + 1
        # Append once per search word, after every document has been scanned
        results.append((search_word, count))
    return results

Votes: 0

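Stack Overflow user

Answered 2019-10-24 16:16:42

How about a solution that uses a Python dict (not to be confused with your variable named dictionary)? In fact, Python provides a very handy variant of dict called defaultdict, which initializes a key to a default value if the key does not exist yet.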
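
As a quick illustration of that default-value behavior, here is a small sketch. The keys "apple" and "pear" are arbitrary examples and do not come from the question.

```python
from collections import defaultdict

counts = defaultdict(int)  # int() returns 0, so missing keys start at 0

counts["apple"] += 1  # key is created with 0, then incremented to 1
counts["apple"] += 1

print(counts["apple"])   # 2
print("pear" in counts)  # False: never touched, so no key exists yet
print(counts["pear"])    # 0: reading a missing key inserts it with the default
print("pear" in counts)  # True
```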

You could write the code like this:

from collections import defaultdict

def word_search():
    # defaultdict(int) supplies 0 for any key that does not exist yet
    results = defaultdict(int)
    for search_word in dictionary:
        for document in library:
            for word in document:
                if search_word == word:
                    results[word] += 1  # Increment the count for the matched word
    return results.items()  # Return the counts as a view of (word, count) tuples

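This produces a collection of (word, count) tuples, one per matched word! Search words that never occur are left out, because no key is ever created for them.

Note: I also fixed the indentation of the for loops in case that was causing problems.

In addition, for efficiency, you can count every word once and look up the search words at the end. Treating the dictionary, the library, and each document as having roughly n entries each, this reduces the complexity from O(n^3) to O(n^2):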

from collections import defaultdict

def word_search():
    # defaultdict(int) supplies 0 for any key that does not exist yet
    counts = defaultdict(int)
    for document in library:
        for word in document:
            counts[word] += 1  # Increment the count for the given word
    # Extract just the counts of the words you are interested in
    results = []
    for search_word in dictionary:
        results.append((search_word, counts[search_word]))
    return results

If your documents are very large, this should reduce your running time considerably!
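
As a possible alternative to the defaultdict version above, the standard library's collections.Counter can do the same tally. This is only a sketch, not the answerer's code: passing library and dictionary in as parameters and the sample data are assumptions made so the example runs on its own.

```python
from collections import Counter

def word_search(library, dictionary):
    """Return (search_word, count) pairs; both arguments are passed in explicitly."""
    counts = Counter()
    for document in library:
        counts.update(document)  # tally every word in this document
    # Counter returns 0 for words it has never seen
    return [(search_word, counts[search_word]) for search_word in dictionary]

# Hypothetical sample data
library = [["apple", "banana", "apple"], ["apple", "cherry"]]
dictionary = ["apple", "cherry", "pear"]
print(word_search(library, dictionary))
# [('apple', 3), ('cherry', 1), ('pear', 0)]
```

Like the version above, this makes a single pass over the library and then one lookup per search word. It assumes each document is an iterable of words, just as the original loops do.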

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link: https://stackoverflow.com/questions/58533249
