Q: How to efficiently look up a dictionary value based on another value in a list of dictionaries

Stack Overflow user

Asked 2021-12-09 15:15:49

Answers: 3, Views: 284, Followers: 0, Votes: 1

I have a very large list of dictionaries (~100K entries):

[{'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'}, {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'}, {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'}]

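Given a token ID (e.g. 1989), how can I efficiently find the corresponding score? I have to do this many times: I have several of these large lists, and several token IDs to look up in each list.

At the moment I iterate over every dictionary in the list, check whether its ID matches my input ID, and take the score if it does. But this is slow.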
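For reference, the linear scan described in the question might look like the sketch below. The function name `find_score_linear` is hypothetical (it does not appear in the original post), and the sample list is the one shown above. Each call costs O(n) in the length of the list, which is why repeating it for many token IDs gets slow.

```python
# Sample data copied from the question.
records = [
    {'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'},
    {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'},
    {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'},
]


def find_score_linear(records, token_id):
    """Hypothetical helper: scan the list and return the first matching score, or None."""
    for record in records:
        if record['token'] == token_id:
            return record['score']
    return None


print(find_score_linear(records, 1989))  # 0.06504543870687485
```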

python

dictionary

3 answers

Stack Overflow user

Accepted answer

Posted 2021-12-09 15:22:15

Since you have to search many times, consider building a dictionary keyed by token:

a = [{'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'}, {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'}, {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'}]

my_dict = {i['token']: i for i in a}

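Building the dict takes some time (a single pass over the list), but every lookup after that is O(1).

This may look wasteful, but Python handles the memory efficiently: instead of copying each dictionary into the new structure, my_dict only stores references to the dicts already built in the list. You can confirm this with: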

>>> a[0] is my_dict[3805]
True

So you can think of it as creating an alias for each element of the list.
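Putting this together, a minimal sketch of the lookup itself follows. The name `score_by_token` and the token ID 99999 are illustrative assumptions, not part of the original answer; mapping tokens straight to scores is an optional variant for when only the score is needed. Note that if the same token appears more than once in the list, the last entry wins in either dictionary.

```python
a = [
    {'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'},
    {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'},
    {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'},
]

# Index from the answer: token -> the original record (a reference, not a copy).
my_dict = {i['token']: i for i in a}
score = my_dict[1989]['score']
print(score)  # 0.06504543870687485

# Assumed variant: token -> score only, for when the rest of the record is not needed.
score_by_token = {i['token']: i['score'] for i in a}

# dict.get returns None instead of raising KeyError when the token is absent.
missing = score_by_token.get(99999)
print(missing)  # None
```

Building the index once per list and reusing it for every token ID is what makes the approach pay off; rebuilding it for each lookup would be no faster than the original scan.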

Votes: 5

Stack Overflow user

Posted 2021-12-09 15:53:30

For large datasets, using pandas may be more efficient.

Example of looking up the score for token 3805:

import pandas as pd

source_list = [{'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'}, {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'}, {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'}]

df = pd.DataFrame(source_list)
result = df[df.token == 3805]

print(result.score.values[0])

Votes: 0

Stack Overflow user

Posted 2021-12-09 16:18:07

If your list of dictionaries is:

l = [{'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'}, {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'}, {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'}]

and the token values you are interested in are:

token_values = [1989, 30897, 98762]

then build a dictionary as follows:

d = {the_dict['token']: the_dict['score']
     for the_dict in l if the_dict['token'] in token_values}

This builds a minimal dictionary containing only the tokens you are interested in and their corresponding scores.
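A runnable sketch of this filtered approach follows. Converting `token_values` to a set (here called `wanted`, an assumed name) is an addition to the original answer: membership tests against a list are O(k) in the number of tokens, while a set makes them O(1) on average. The final `scores` mapping is also an assumption, shown only to illustrate how tokens missing from the list can be reported as None.

```python
l = [
    {'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'},
    {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'},
    {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'},
]
token_values = [1989, 30897, 98762]

# Assumed optimisation: a set gives average O(1) membership tests.
wanted = set(token_values)

# One pass over the list, keeping only the tokens of interest.
d = {the_dict['token']: the_dict['score']
     for the_dict in l if the_dict['token'] in wanted}
print(d)  # {1989: 0.06504543870687485}

# Tokens that never appear in the list map to None.
scores = {token: d.get(token) for token in token_values}
print(scores)  # {1989: 0.06504543870687485, 30897: None, 98762: None}
```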

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/70292305
