Counting frequencies in a tuple
Stack Overflow user
Asked 2021-04-06 10:14:49
Answers: 1 · Views: 45 · Followers: 0 · Votes: 1

I have to count how often each "Id" occurs in a tuple like the following:

('{44371-zwart,40793,41878,44747,44371-wit}',), 
('{46022,47917,48267,48343,48221}',), 
('{43566,43834,31726,23503,4488}',), 
('{21896,9391,32171,30984-wit-3942,27211}',), 
('{35306,16901,24027,44222,38597}',), 
('{40867,40872,41437,31421,35570-grijs}',), 
('{32481,35728,36463,32473,43719}',)

This is only a small part of the data (about 0.5%).

My current code is:

from collections import Counter

cur.execute('SELECT similars FROM profiles')
data = cur.fetchall()
c = Counter(elem[0] for elem in data)

It returns the following:

{'{45110,46709,45109,45115,46462}': 1, 
'{38535,38529,38532,38527,38546}': 1, 
'{20062,17013,20634,21691,20622}': 1, 
'{21141,43588,39649,45900,17126}': 1, 
'{43552,41475,41478,32848,41477}': 1, 
'{42265,42266,43570,26203,28862}': 1, 
'{47874,47873,47878,47802-bruin,33101-avengers}': 1, 
'{26234,2401,30414,5655,16605}': 1, 
'{43405,43575,39649,21141,43195}': 1, 
'{35420,35422,35367,35418,35417}': 1, 
'{43195,47323,39649,43575,44454}': 1, 
'{9760,43572,9764,9768,9816}': 1

The result I expect/want is:

{'12392': 2, '7862': 1, '12313': 41}

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-04-06 10:38:40

Since you are getting this:

'{45110,46709,45109,45115,46462}': 1, '{38535,38529,38532,38527,38546}': 1, '{20062,17013,20634,21691,20622}': 1, '{21141,43588,39649,45900,17126}': 1, '{43552,41475,41478,32848,41477}': 1, '{42265,42266,43570,26203,28862}': 1, '{47874,47873,47878,47802-bruin,33101-avengers}': 1, '{26234,2401,30414,5655,16605}': 1, '{43405,43575,39649,21141,43195}': 1, '{35420,35422,35367,35418,35417}': 1, '{43195,47323,39649,43575,44454}': 1, '{9760,43572,9764,9768,9816}': 1

Convert this output into a dictionary, so that your first-level output looks like this:

dct = {'{45110,46709,45109,45115,46462}': 1, '{38535,38529,38532,38527,38546}': 1, '{20062,17013,20634,21691,20622}': 1, '{21141,43588,39649,45900,17126}': 1, '{43552,41475,41478,32848,41477}': 1, '{42265,42266,43570,26203,28862}': 1, '{47874,47873,47878,47802-bruin,33101-avengers}': 1, '{26234,2401,30414,5655,16605}': 1, '{43405,43575,39649,21141,43195}': 1, '{35420,35422,35367,35418,35417}': 1, '{43195,47323,39649,43575,44454}': 1, '{9760,43572,9764,9768,9816}': 1
}

Now create an empty list, id_corpus, then get all the keys of that dictionary with dct.keys() and loop over them.

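Inside the loop, use the str method replace() to remove the opening and closing braces, then use split() to break the remaining string into a list. Add this newly formed list to id_corpus. Remember not to use append here; add it with the + operator, so that the individual IDs are added rather than the list as a single nested element.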
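To make the append versus + point concrete, here is a minimal sketch. The names `with_append` and `with_plus` are hypothetical, and the sample string is one value taken from the question's data.

```python
# Hypothetical illustration: one raw value from the 'similars' column.
raw = '{46022,47917,48267,48343,48221}'

# Remove the surrounding braces, then split on commas.
ids = raw.replace('{', '').replace('}', '').split(',')

with_append = []
with_append.append(ids)   # adds the whole list as ONE nested element

with_plus = []
with_plus += ids          # adds each ID as its own element

print(with_append)  # [['46022', '47917', '48267', '48343', '48221']]
print(with_plus)    # ['46022', '47917', '48267', '48343', '48221']
```

With append, the later counting loop would see lists instead of ID strings, which is why the flat form is needed.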

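Finally, create an empty dictionary, corpus, and iterate over the elements of id_corpus: if the element is already in the dictionary, increase its value by 1; otherwise set its value to 1.

Here is the final solution: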

# I don't know what your data looks like or in what format
# you are getting it from MySQL, so I am building on the
# output you posted. A more optimized approach could be
# developed if I knew more about the problem.
dct = {'{45110,46709,45109,45115,46462}': 1, '{38535,38529,38532,38527,38546}': 1, '{20062,17013,20634,21691,20622}': 1, '{21141,43588,39649,45900,17126}': 1, '{43552,41475,41478,32848,41477}': 1, '{42265,42266,43570,26203,28862}': 1, '{47874,47873,47878,47802-bruin,33101-avengers}': 1, '{26234,2401,30414,5655,16605}': 1, '{43405,43575,39649,21141,43195}': 1, '{35420,35422,35367,35418,35417}': 1, '{43195,47323,39649,43575,44454}': 1, '{9760,43572,9764,9768,9816}': 1}
lst = []
for ky in dct.keys():
    ky = ky.replace('{', '')
    ky = ky.replace('}', '')
    ky = ky.split(',')
    lst += ky

sol = dict()
for item_id in lst:  # avoid shadowing the built-in id()
    if item_id in sol:
        sol[item_id] += 1
    else:
        sol[item_id] = 1

print(sol)

Output

{'16605': 1, '44454': 1, '45900': 1, '20634': 1, '46462': 1, '35422': 1, '35420': 1, '17013': 1, '38532': 1, '47323': 1, '21141': 2, '43405': 1, '38527': 1, '17126': 1, '9816': 1, '38529': 1, '35418': 1, '45109': 1, '2401': 1, '41477': 1, '41478': 1, '41475': 1, '47802-bruin': 1, '26234': 1, '32848': 1, '35367': 1, '43195': 2, '20622': 1, '43588': 1, '35417': 1, '9760': 1, '38546': 1, '9764': 1, '28862': 1, '26203': 1, '9768': 1, '5655': 1, '39649': 3, '47874': 1, '43552': 1, '47873': 1, '38535': 1, '21691': 1, '30414': 1, '20062': 1, '43570': 1, '42266': 1, '42265': 1, '43575': 2, '46709': 1, '43572': 1, '47878': 1, '45110': 1, '33101-avengers': 1, '45115': 1}
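As a more direct alternative, the same counts can probably be obtained straight from the fetched rows with collections.Counter. This is only a sketch under assumptions: it assumes every row is a one-element tuple holding a brace-wrapped, comma-separated string, as in the question's sample, and the helper name `count_ids` is hypothetical. Suffixed IDs such as '47802-bruin' are kept as they are.

```python
from collections import Counter

# Assumption: rows have the same shape as cur.fetchall() returns in the
# question, i.e. one-element tuples with a brace-wrapped string.
rows = [
    ('{21141,43588,39649,45900,17126}',),
    ('{43405,43575,39649,21141,43195}',),
    ('{43195,47323,39649,43575,44454}',),
]

def count_ids(rows):
    """Count how often each ID appears across all rows (hypothetical helper)."""
    counts = Counter()
    for (similars,) in rows:
        # Drop the outer braces, then split into individual ID strings.
        counts.update(similars.strip('{}').split(','))
    return counts

counts = count_ids(rows)
print(counts.most_common(1))  # [('39649', 3)]
```

In the question's setting, `rows` would be replaced by the result of `cur.fetchall()`; `dict(counts)` gives a plain dictionary if that form is preferred.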
Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/66966735
