I have a list of dictionaries like this:
list_validation = [{'name': 'Alice', 'street': 'Baker Street', 'stamp': 'T05', 'city': 'London'}, {'name': 'Margaret', 'street': 'Castle Street', 'stamp': 'T01', 'city': 'Cambridge'}, {'name': 'Fred', 'street': 'Baker Street', 'stamp': 'T012', 'city': 'London'}]
and a DataFrame with these columns:
df = pd.DataFrame({'name': ['Fred', 'Jane', 'Alice', 'Margaret'], 'street': ['Baker Street', 'Downing Street', 'Baker Street', 'Castle Street'],
'stamp': ['', 'T03', '', ''],
'city': ['', 'London', '', ''],
'other irrelevant columns for this task' : [1, 2, 3, 4]
})
What I want is to fill in the blanks in the stamp and city columns, like this:
df2 = pd.DataFrame({'name': ['Fred', 'Jane', 'Alice', 'Margaret'], 'street': ['Baker Street', 'Downing Street', 'Baker Street', 'Castle Street'],
'stamp': ['T012', 'T03', 'T05', 'T01'],
'city': ['London', 'London', 'London', 'Cambridge'],
'other irrelevant columns for this task' : [1, 2, 3, 4]
})
I have been trying the following, but it does not work and I am not making progress:
new_dict = df[['name', 'street', 'stamp', 'city']].to_dict()
list(new_dict)
for l in list_validation:
    for row in new_dict:
        if l['name'] == row['name'] and l['street'] == row['street']:
            row['stamp'] = l['stamp']
            row['city'] = l['city']
Posted on 2021-10-13 18:09:40
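A note on why the attempt in the question fails: with its default orientation, `DataFrame.to_dict()` returns a dict keyed by column name, so iterating over it yields column-name strings, and `row['name']` raises a `TypeError`. The following is a minimal sketch of the same loop idea using `to_dict('records')`, which yields one dict per row. The names `lookup`, `records`, and `filled` are illustrative and not from the original post, and the sketch assumes blanks are empty strings and that each (name, street) pair appears at most once in `list_validation`.

```python
import pandas as pd

list_validation = [
    {'name': 'Alice', 'street': 'Baker Street', 'stamp': 'T05', 'city': 'London'},
    {'name': 'Margaret', 'street': 'Castle Street', 'stamp': 'T01', 'city': 'Cambridge'},
    {'name': 'Fred', 'street': 'Baker Street', 'stamp': 'T012', 'city': 'London'},
]
df = pd.DataFrame({
    'name': ['Fred', 'Jane', 'Alice', 'Margaret'],
    'street': ['Baker Street', 'Downing Street', 'Baker Street', 'Castle Street'],
    'stamp': ['', 'T03', '', ''],
    'city': ['', 'London', '', ''],
    'other irrelevant columns for this task': [1, 2, 3, 4],
})

# Hypothetical helper: index the validation entries by (name, street).
lookup = {(d['name'], d['street']): d for d in list_validation}

# 'records' orientation gives a list of per-row dicts that can be edited in place.
records = df.to_dict('records')
for row in records:
    match = lookup.get((row['name'], row['street']))
    if match is None:
        continue  # no validation entry for this row, leave it untouched
    for col in ('stamp', 'city'):
        if row[col] == '':  # assumption: blanks are empty strings, not NaN
            row[col] = match[col]

# Rebuild a DataFrame from the edited row dicts.
filled = pd.DataFrame(records)
```

Building the lookup once avoids rescanning `list_validation` for every row, which matters if either the list or the DataFrame is large.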
Here is one approach: iterate over each row of the DataFrame and fill in the missing values from the list.
List definition:
list_validation = [{'name': 'Alice', 'street': 'Baker Street', 'stamp': 'T05', 'city': 'London'}, {'name': 'Margaret', 'street': 'Castle Street', 'stamp': 'T01', 'city': 'Cambridge'}, {'name': 'Fred', 'street': 'Baker Street', 'stamp': 'T012', 'city': 'London'}]
DataFrame definition:
df = pd.DataFrame({'name': ['Fred', 'Jane', 'Alice', 'Margaret'], 'street': ['Baker Street', 'Downing Street', 'Baker Street', 'Castle Street'],
'stamp': ['', 'T03', '', ''], 'city': ['', 'London', '', ''], 'other irrelevant columns for this task': [1, 2, 3, 4]})
Logic:
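# r is the row label, i is the row as a Series.
# Note: entries are matched on name only; add a street comparison if names can repeat.
for r, i in df.iterrows():
    name_in_df = i['name']
    # if pd.isna(i['stamp']):  # use this check instead if blanks are NaN
    if not i['stamp']:
        for j in list_validation:
            if j['name'] == name_in_df:
                value_in_list = j['stamp']
                df.loc[r, 'stamp'] = value_in_list
                break
    # if pd.isna(i['city']):  # use this check instead if blanks are NaN
    if not i['city']:
        name_in_df = i['name']
        for j in list_validation:
            if j['name'] == name_in_df:
                value_in_list = j['city']
                df.loc[r, 'city'] = value_in_list
                break
df
Posted on 2021-10-13 17:59:29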
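Here is the approach I would use: set the index of df to name and street; create a new DataFrame from list_validation and set its index to name and street as well; then mask the empty values in df1 and fill the masked values with the values from df2.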
c = ['name', 'street']
df1 = df.set_index(c)
df2 = pd.DataFrame(list_validation).set_index(c)
df1.mask(df1.eq('')).fillna(df2).reset_index()

       name          street  stamp       city  other irrelevant columns for this task
0      Fred    Baker Street   T012     London                                       1
1      Jane  Downing Street    T03     London                                       2
2     Alice    Baker Street    T05     London                                       3
3  Margaret   Castle Street    T01  Cambridge                                       4

https://stackoverflow.com/questions/69559570
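As a variation on the index-alignment idea above, the same fill can be sketched with a left merge on name and street. This is not from the original answers. The `_ref` suffix and the names `keys`, `ref`, `merged`, and `result` are illustrative choices, and the sketch assumes blanks are empty strings and that each (name, street) pair is unique in `list_validation` (duplicates would multiply rows in the merge).

```python
import pandas as pd

list_validation = [
    {'name': 'Alice', 'street': 'Baker Street', 'stamp': 'T05', 'city': 'London'},
    {'name': 'Margaret', 'street': 'Castle Street', 'stamp': 'T01', 'city': 'Cambridge'},
    {'name': 'Fred', 'street': 'Baker Street', 'stamp': 'T012', 'city': 'London'},
]
df = pd.DataFrame({
    'name': ['Fred', 'Jane', 'Alice', 'Margaret'],
    'street': ['Baker Street', 'Downing Street', 'Baker Street', 'Castle Street'],
    'stamp': ['', 'T03', '', ''],
    'city': ['', 'London', '', ''],
    'other irrelevant columns for this task': [1, 2, 3, 4],
})

keys = ['name', 'street']
ref = pd.DataFrame(list_validation)

# A left merge keeps every row of df in its original order; columns coming
# from the validation list get the (illustrative) '_ref' suffix.
merged = df.merge(ref, on=keys, how='left', suffixes=('', '_ref'))

for col in ['stamp', 'city']:
    # Keep existing non-empty values; otherwise take the reference value.
    merged[col] = merged[col].where(merged[col].ne(''), merged[col + '_ref'])

# Drop the helper columns so the result has the original column layout.
result = merged.drop(columns=['stamp_ref', 'city_ref'])
```

Rows with no match in the validation list, such as Jane here, keep whatever values they already had.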