Hi, I have two different dictionaries. I am trying to merge the two while removing duplicates. Here are the two lists.
x = [{'relevance': 0.722, 'type': 'Company', 'name': 'Dell'}, {'relevance': 0.314, 'type': 'OperatingSystem', 'name': 'VMs'}, {'relevance': 0.122, 'type': 'Technology', 'name': 'iSCSI'}, {'relevance': 0.266, 'type': 'Company', 'name': 'Force10'}, {'relevance': 0.327, 'type': 'Person', 'name': 'Greg Althaus'}, {'relevance': 0.085, 'type': 'URL', 'name': 'http://Dell.com/OpenStack'}, {'relevance': 0.174, 'type': 'Company', 'name': 'Storage Hardware'}]
y = [{'relevance': u'0.874065', 'type': u'Company', 'name': u'Dell'}, {'relevance': u'0.522169', 'type': u'OperatingSystem', 'name': u'VMs'}, {'relevance': u'0.444586', 'type': u'Person', 'name': u'Rob Hirschfeld'}, {'relevance': u'0.413988', 'type': u'Person', 'name': u'Greg Althaus'}, {'relevance': u'0.376489', 'type': u'FieldTerminology', 'name': u'iSCSI'}, {'relevance': u'0.314059', 'type': u'Company', 'name': u'Force10'}]
I tried:
z = x.update(y)
print x
It gave me this error:
AttributeError: 'str' object has no attribute 'update'
I also tried:
z = dict(x.items() + y.items())
It gave me this error:
AttributeError: 'str' object has no attribute 'items'
Then I tried:
z = dict(x, **y)
It gave me this error:
TypeError: type object argument after ** must be a mapping, not str
Then I tried:
z = dict(chain(x.iteritems(), y.iteritems()))
It gave me this error:
AttributeError: 'str' object has no attribute 'iteritems'

Posted on 2012-03-27 21:59:58
If you want a new list of dictionaries that merges the two and drops the duplicates, this simple function will do the job.
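def DictListUpdate(lis1, lis2):
    # append every dict from lis1 that is not already in lis2
    # (lis2 is modified in place and returned)
    for aLis1 in lis1:
        if aLis1 not in lis2:
            lis2.append(aLis1)
    return lis2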
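One thing to keep in mind with the function above: it appends to lis2 and returns that same list, so the caller's second list changes. Dictionaries also only count as duplicates when every key and value matches, so entries that differ in relevance (as in the question's data) would all be kept. A minimal sketch of a non-mutating variant, assuming Python 3; the name merge_unique and the variables a and b are hypothetical and not part of the answer:

```python
def merge_unique(first, second):
    """Return a new list: the items of `second`, followed by the items of
    `first` that are not already present. Neither input is modified."""
    merged = list(second)        # shallow copy, so `second` stays untouched
    for item in first:
        if item not in merged:   # dicts are equal only if all key/value pairs match
            merged.append(item)
    return merged


a = [{"name": "surya", "company": "dell"}, {"name": "jobs", "company": "apple"}]
b = [{"name": "surya", "company": "dell"}, {"name": "gates", "company": "microsoft"}]

result = merge_unique(a, b)
print(result)
print(len(b))  # still 2, because the second list was not extended
```

The membership test scans the list each time, which is fine for short lists like these but grows quadratically with the input size.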
# example usage
x = [{"name": "surya", "company": "dell"},
     {"name": "jobs", "company": "apple"}]
y = [{"name": "surya", "company": "dell"},
     {"name": "gates", "company": "microsoft"}]
print DictListUpdate(x, y)
Output:
>>>
[{'company': 'dell', 'name': 'surya'}, {'company': 'microsoft', 'name': 'gates'}, {'company': 'apple', 'name': 'jobs'}]

Posted on 2012-03-27 21:31:19
You can convert the lists held in the strings into dicts keyed by name, then update:
import ast
x = "[{'relevance': 0.722, 'type': 'Company', 'name': 'Dell'}, {'relevance': 0.314, 'type': 'OperatingSystem', 'name': 'VMs'}, {'relevance': 0.122, 'type': 'Technology', 'name': 'iSCSI'}, {'relevance': 0.266, 'type': 'Company', 'name': 'Force10'}, {'relevance': 0.327, 'type': 'Person', 'name': 'Greg Althaus'}, {'relevance': 0.085, 'type': 'URL', 'name': 'http://Dell.com/OpenStack'}, {'relevance': 0.174, 'type': 'Company', 'name': 'Storage Hardware'}]"
y = "[{'relevance': u'0.874065', 'type': u'Company', 'name': u'Dell'}, {'relevance': u'0.522169', 'type': u'OperatingSystem', 'name': u'VMs'}, {'relevance': u'0.444586', 'type': u'Person', 'name': u'Rob Hirschfeld'}, {'relevance': u'0.413988', 'type': u'Person', 'name': u'Greg Althaus'}, {'relevance': u'0.376489', 'type': u'FieldTerminology', 'name': u'iSCSI'}, {'relevance': u'0.314059', 'type': u'Company', 'name': u'Force10'}]"
# make a dictionary with the names as keys
x, y = (dict((d['name'], d)
             # after loading the lists out of the strings safely
             for d in ast.literal_eval(lst))
        # for each of the two strings
        for lst in (x, y))
# or on Python 2.7+:
x, y = ({d['name']: d for d in ast.literal_eval(lst)} for lst in (x, y))
# combine the two dicts
x.update(y)
Then, if you want a list back, it's just
x.values()
You mentioned sorting in the title. If you want the list sorted by name:
import operator
sorted(x.itervalues(), key=operator.itemgetter('name'))

Posted on 2012-03-27 22:34:18
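The first thing to note is that you don't have two different dictionaries. You have two different lists of dictionaries. The second problem is that you don't explain exactly what counts as a duplicate. The third problem is that you don't say what should happen to the relevance key.
I'm going to assume that two dictionaries with the same type and name keys are identical, and that you want the relevance values merged into a list. You can then take their average, or whatever else you need.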
def gen_key(d):
    # two entries are duplicates when name and type both match
    return (d['name'], d['type'])

def merge_dupes(dlist):
    # collect the relevance of every duplicate as a float
    relevance = [float(d['relevance']) for d in dlist]
    name, type = dlist[0]['name'], dlist[0]['type']
    return {'name': name, 'type': type, 'relevance': relevance}

# group the dictionaries from both lists by their key
to_merge = {}
for l in (x, y):
    for d in l:
        to_merge.setdefault(gen_key(d), []).append(d)

# if you want another list
merged_list = [merge_dupes(l) for l in to_merge.itervalues()]
# if you'd prefer a dictionary
merged_dict = dict((k, merge_dupes(v)) for k, v in to_merge.iteritems())
Output:
>>> pprint(merged_list)
[{'name': u'Rob Hirschfeld',
  'relevance': [0.44458599999999998],
  'type': u'Person'},
 {'name': 'VMs',
  'relevance': [0.314, 0.52216899999999999],
  'type': 'OperatingSystem'},
 {'name': 'Greg Althaus',
  'relevance': [0.32700000000000001, 0.41398800000000002],
  'type': 'Person'},
 {'name': 'Storage Hardware',
  'relevance': [0.17399999999999999],
  'type': 'Company'},
 {'name': u'iSCSI',
  'relevance': [0.37648900000000002],
  'type': u'FieldTerminology'},
 {'name': 'Force10',
  'relevance': [0.26600000000000001, 0.31405899999999998],
  'type': 'Company'},
 {'name': 'http://Dell.com/OpenStack',
  'relevance': [0.085000000000000006],
  'type': 'URL'},
 {'name': 'Dell',
  'relevance': [0.72199999999999998, 0.87406499999999998],
  'type': 'Company'},
 {'name': 'iSCSI', 'relevance': [0.122], 'type': 'Technology'}]
>>> pprint(merged_dict)
{('Dell', 'Company'): {'name': 'Dell',
                       'relevance': [0.72199999999999998,
                                     0.87406499999999998],
                       'type': 'Company'},
 ('Force10', 'Company'): {'name': 'Force10',
                          'relevance': [0.26600000000000001,
                                        0.31405899999999998],
                          'type': 'Company'},
 ('Greg Althaus', 'Person'): {'name': 'Greg Althaus',
                              'relevance': [0.32700000000000001,
                                            0.41398800000000002],
                              'type': 'Person'},
 (u'Rob Hirschfeld', u'Person'): {'name': u'Rob Hirschfeld',
                                  'relevance': [0.44458599999999998],
                                  'type': u'Person'},
 ('Storage Hardware', 'Company'): {'name': 'Storage Hardware',
                                   'relevance': [0.17399999999999999],
                                   'type': 'Company'},
 ('VMs', 'OperatingSystem'): {'name': 'VMs',
                              'relevance': [0.314, 0.52216899999999999],
                              'type': 'OperatingSystem'},
 ('http://Dell.com/OpenStack', 'URL'): {'name': 'http://Dell.com/OpenStack',
                                        'relevance': [0.085000000000000006],
                                        'type': 'URL'},
 (u'iSCSI', u'FieldTerminology'): {'name': u'iSCSI',
                                   'relevance': [0.37648900000000002],
                                   'type': u'FieldTerminology'},
 ('iSCSI', 'Technology'): {'name': 'iSCSI',
                           'relevance': [0.122],
                           'type': 'Technology'}}

Original question: https://stackoverflow.com/questions/9890364
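The last answer leaves the final step open ("take their average, or whatever"). A minimal sketch of the averaging step, assuming Python 3 and the same (name, type) duplicate rule; the input is a two-entry subset of the question's lists, and the names grouped and averaged are hypothetical, not from the answer:

```python
from collections import defaultdict

# Two-entry subset of the question's data, used only for illustration.
x = [{'relevance': 0.722, 'type': 'Company', 'name': 'Dell'},
     {'relevance': 0.122, 'type': 'Technology', 'name': 'iSCSI'}]
y = [{'relevance': '0.874065', 'type': 'Company', 'name': 'Dell'},
     {'relevance': '0.376489', 'type': 'FieldTerminology', 'name': 'iSCSI'}]

# Group relevance values by (name, type), converting strings to floats.
grouped = defaultdict(list)
for entry in x + y:
    grouped[(entry['name'], entry['type'])].append(float(entry['relevance']))

# Collapse each group into one dict holding the mean relevance.
averaged = [
    {'name': name, 'type': type_, 'relevance': sum(values) / len(values)}
    for (name, type_), values in grouped.items()
]

for row in averaged:
    print(row)
```

As in the answer's output, the two iSCSI entries stay separate because their type differs; only the Dell entries are combined.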