Question: Count bigram occurrences in a string and store them in a dictionary

Stack Overflow user

Asked 2020-03-09 22:48:55

3 answers · 190 views · 0 followers · 1 vote

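I am writing code in Python. I have a string, and I want to count how many times each bigram occurs in it. For example, given the string "test string", I want to iterate over it in substrings of size 2 and build a dictionary that maps each bigram to the number of times it occurs in the original string.

So I would like to get output of the form {te: 1, es: 1, st: 2, ...}.

Can you help me get started?

Best regards!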

python

string

dictionary

3 Answers

Stack Overflow user

Answered 2020-03-09 22:58:11

Given

s = "test string"

do

from collections import Counter
Counter(map(''.join, zip(s, s[1:])))

or

from collections import Counter
Counter(s[i:i+2] for i in range(len(s)-1))

Both approaches give

Counter({'st': 2, 'te': 1, 'es': 1, 't ': 1, ' s': 1, 'tr': 1, 'ri': 1, 'in': 1, 'ng': 1})
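Since the question asks for a dictionary, it may help to note that a Counter is a dict subclass and can be converted with dict(). The following is a minimal sketch that wraps the slicing approach in a helper and generalizes it to substrings of any length; the name ngram_counts and its parameter n are assumptions made for illustration, not part of the original answer.

```python
from collections import Counter


def ngram_counts(text, n=2):
    """Count every length-n substring of text (hypothetical helper)."""
    # range() is empty when text is shorter than n, so the result is empty too.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Convert to a plain dict if a Counter is not wanted.
bigrams = dict(ngram_counts("test string"))
print(bigrams["st"])  # 2
```

Note that, as in the answer above, the space counts as a character, so 't ' and ' s' are included among the keys.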

Votes: 3

Stack Overflow user

Answered 2020-03-09 22:58:29

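By the way, what you are looking for are called bigrams. For larger-scale work, there are robust implementations in various machine learning / NLP toolkits.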

As an ad hoc solution, the problem can be broken down into:

1. Iterating over the sequence as pairs of "current element, next element".
2. Counting the unique pairs.

Problem #1 is solved by pairwise from the itertools recipes.

Problem #2 is solved by Counter.

Putting these together gives

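from collections import Counter
from itertools import tee

def pairwise(iterable):
    # s -> (s0, s1), (s1, s2), (s2, s3), ...
    a, b = tee(iterable)
    next(b, None)  # Advance the second iterator by one element.
    return zip(a, b)

# Keys are tuples of adjacent characters, e.g. ('s', 't').
Counter(pairwise('test string'))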
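One detail worth spelling out: pairwise yields tuples such as ('s', 't'), so the counter's keys are tuples rather than two-character strings. The sketch below joins each pair to get string keys like those in the question. It also assumes you may be on Python 3.10 or later, where itertools ships its own pairwise, and falls back to the recipe otherwise; the variable names are illustrative only.

```python
from collections import Counter
from itertools import tee

try:
    from itertools import pairwise  # Built in from Python 3.10 onward.
except ImportError:
    def pairwise(iterable):
        # Fallback: the recipe from the itertools documentation.
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

# Tuple keys, exactly as pairwise produces them.
pair_counts = Counter(pairwise("test string"))

# String keys, obtained by joining each (current, next) pair.
string_counts = Counter(map("".join, pairwise("test string")))

print(pair_counts[("s", "t")])  # 2
print(string_counts["st"])      # 2
```

Because pairwise works on any iterable, this version also handles generators and other inputs that cannot be sliced.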

Votes: 1

Stack Overflow user

Answered 2020-03-09 23:03:19

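I think something like this is simple, easy to do, and does not require importing any library.

First, we remove all whitespace from the string using split() and join().

Then we build a list containing every substring of length 2 (the step size).

Finally, we build and print() a dictionary that has each substring as a key and the number of times it occurs in the whitespace-free string as the value.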

substr = []  # List that will hold every substring of length `step`.
step = 2  # Substring (n-gram) size.
s = ''.join('test string'.split())  # Remove all whitespace from the string.
for i in range(len(s) - step + 1):  # Stop early so every slice has full length.
    substr.append(s[i: i + step])
# Build and print a dictionary that counts the occurrences of each substring.
occurrences = {k: substr.count(k) for k in substr}
print(occurrences)

When run, it prints a dictionary as you requested:

{'te': 1, 'es': 1, 'st': 2, 'ts': 1, 'tr': 1, 'ri': 1, 'in': 1, 'ng': 1}
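Two things about this approach are worth noting. Stripping the whitespace joins the two words, which is why 'ts' shows up in the output even though it never occurs in "test string". Also, calling list.count() once per substring rescans the list each time, which gets slow for long inputs. The following is a sketch of an import-free alternative that counts in a single pass and, optionally, counts bigrams inside each whitespace-separated word only; the function name bigram_counts and the per_word flag are assumptions introduced for illustration.

```python
def bigram_counts(text, per_word=False):
    """Count bigrams with a plain dict (hypothetical helper)."""
    # With per_word=True, bigrams never span a word boundary.
    chunks = text.split() if per_word else [text]
    counts = {}
    for chunk in chunks:
        for i in range(len(chunk) - 1):
            pair = chunk[i:i + 2]
            # dict.get supplies 0 the first time a bigram is seen.
            counts[pair] = counts.get(pair, 0) + 1
    return counts


print(bigram_counts("test string", per_word=True))
# {'te': 1, 'es': 1, 'st': 2, 'tr': 1, 'ri': 1, 'in': 1, 'ng': 1}
```

With per_word left at False, the space is treated as an ordinary character, so the result matches the Counter-based answers.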

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/60603002
