
How to count occurrences of strings in the values of a dictionary

Stack Overflow user
Asked 2021-03-13 23:49:35
Answers: 3 · Views: 22 · Followers: 0 · Votes: 0

I'm new to Python, but I'm helping with a project to determine the number of biased words in some data.

I currently have a list of coded words:

male_coded_words = ['active','adventurous','aggress','ambitio','analy','assert']

I also have a dictionary of job titles and skills:

jobsdict = {'fork lift truck driver': ['fork lift truck driv','assert'], 'assistant fraud and payment risk manager': ['fraud', 'online fraud', 'fraud detect', 'payment system', 'risk manag'], 'paralegal vacancy corporate immigration (london office)': ['legal', 'microsoft offic', 'communication skil'], 'transport operator': ['transport','active'], 'year 5 primary teacher': ['newham'], 'multi agency safeguarding administrator': ['admin', 'social work', 'safeguard', 'social work admin', 'children administr', 'social work administr', 'safeguarding administr']}

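I want to iterate over the dictionary and find, for each key, how many of its values appear in the male_coded_words list.

The output should be a dictionary of the form {'fork lift truck driver': "count":"1", "coded_words":["assert"].....}

My code so far: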

final_count = 0
final_output = {}

for k, v in jobsdict:
    final_output[k] = []
    if 'analy' in str(v):
        n = final_count + 1
    else:
        n = 0  
    final_output[k].append(n)
    final_output[k].append(v)

3 Answers

Stack Overflow user

Answered 2021-03-14 01:06:42

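A good idea here is to take advantage of Python's set object, which acts as an unordered alternative to a list. Membership tests and intersections on sets tend to be much faster than the equivalent operations on lists. For brevity, I've also used a dictionary comprehension and a Counter object to count the instances of coded words automatically. The script below should give you the output you specified: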

from collections import Counter

# General form of the data provided above, for reference
# male_coded_words = ['active', ...]
# jobsdict = {'fork lift truck driver': ['fork lift truck driv','assert'], ...}

result = {k: Counter(set(v) & set(male_coded_words)) for k,v in jobsdict.items()}

# result will look like {'fork lift truck driver': Counter({'assert': 1}), ...}.
# If no coded words exist for a specific job, then its value in the
# result dict will just be an empty Counter.
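
Because each skill list is converted to a set before counting, every count produced this way is 1, and only skills that exactly equal a coded word are matched. As a minimal sketch of the same set-intersection idea reshaped into the count/coded_words form the question asks for (the shortened jobsdict sample and the names coded, matches, job, and skills are my own, for illustration only):

```python
male_coded_words = ['active', 'adventurous', 'aggress', 'ambitio', 'analy', 'assert']

# Shortened sample of the question's data (assumption: three entries suffice to illustrate)
jobsdict = {
    'fork lift truck driver': ['fork lift truck driv', 'assert'],
    'transport operator': ['transport', 'active'],
    'year 5 primary teacher': ['newham'],
}

coded = set(male_coded_words)  # build the set once, outside the loop

result = {}
for job, skills in jobsdict.items():
    # Exact-match intersection: a skill counts only if it equals a coded word
    matches = sorted(coded & set(skills))
    result[job] = {"count": len(matches), "coded_words": matches}

print(result)
```

Note that a skill such as 'data analysis' would not match 'analy' here, since the intersection compares whole strings rather than substrings.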
Votes: 0

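Stack Overflow user

Answered 2021-03-14 20:00:15

I'm a beginner myself, but I would look into regular expressions and count the matches (that's the simplest approach I can think of).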
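
One way the regular-expression idea might look, as a rough sketch rather than a definitive implementation: the coded words are combined into a single alternation pattern and matched against each skill string. The helper name count_coded and the sample skills 'data analysis' and 'analytics' are hypothetical and not taken from the question's data.

```python
import re

male_coded_words = ['active', 'adventurous', 'aggress', 'ambitio', 'analy', 'assert']

# One alternation pattern; re.escape guards against regex metacharacters in the words
pattern = re.compile("|".join(re.escape(w) for w in male_coded_words))

def count_coded(skills):
    """Return (total matches, sorted distinct coded words) for a list of skill strings."""
    found = []
    for skill in skills:
        # findall returns every non-overlapping match inside this skill string
        found.extend(pattern.findall(skill))
    return len(found), sorted(set(found))

# Hypothetical skill list: 'analy' occurs as a substring of two different skills
print(count_coded(['transport', 'active', 'data analysis', 'analytics']))
# prints (3, ['active', 'analy'])
```

Unlike an exact comparison, this treats the coded words as stems, so 'analy' is found inside both 'data analysis' and 'analytics'.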

Votes: 0

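Stack Overflow user

Answered 2021-03-14 20:25:08

I can't think of a way to do this with fewer than two for loops: one to iterate over jobsdict and another to iterate over the coded words. Also, use jobsdict.items() to iterate over it by key and value: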

final_count = 0
final_output = {}

for k, v in jobsdict.items():
    count, words = 0, []
    s = ' '.join(v)
    # merge all the strings into one to avoid a third nested loop iterating over them;
    # the space separator keeps adjacent strings from fusing into a false match
    for w in male_coded_words:
        c = s.count(w) 
        # can be replaced with `w in s` if you don't want to count multiple occurrences of a word each time
        if c:
            count += c
            words.append(w)
    final_count += count
    final_output[k] = [count, words]

print(final_output, final_count)

This gives me the following output:

{'fork lift truck driver': [1, ['assert']], 'assistant fraud and payment risk manager': [0, []], 'paralegal vacancy corporate immigration (london office)': [0, []], 'transport operator': [1, ['active']], 'year 5 primary teacher': [0, []], 'multi agency safeguarding administrator': [0, []]} 2

Edit: if you want final_output to contain dictionaries, replace the line that assigns final_output[k] with final_output[k] = {"count":count, "words":words}
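
For illustration, the dictionary-valued variant can be wrapped in a function that also exposes the choice between counting every occurrence (s.count(w)) and counting each coded word at most once (w in s). This is only a sketch: the function name summarize, the count_repeats flag, and the sample data are hypothetical, and the strings are joined with a space so that neighboring skills cannot combine into a spurious match.

```python
def summarize(jobs, coded_words, count_repeats=True):
    """Map each job title to {"count": ..., "words": [...]} using substring matching."""
    output = {}
    for title, skills in jobs.items():
        text = ' '.join(skills)  # space-separated so adjacent skills do not fuse
        count, words = 0, []
        for w in coded_words:
            # Either count every occurrence, or at most one per coded word
            c = text.count(w) if count_repeats else int(w in text)
            if c:
                count += c
                words.append(w)
        output[title] = {"count": count, "words": words}
    return output

# Hypothetical sample data, not from the question
sample = {'analyst': ['analytics', 'data analysis'], 'driver': ['transport']}
coded = ['active', 'analy']

print(summarize(sample, coded))
# prints {'analyst': {'count': 2, 'words': ['analy']}, 'driver': {'count': 0, 'words': []}}
print(summarize(sample, coded, count_repeats=False))
# prints {'analyst': {'count': 1, 'words': ['analy']}, 'driver': {'count': 0, 'words': []}}
```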

Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/66615453