I have the following data:
{'POINT_ID': {0: 'ABC B4-14 c1-1', 1: 'ABC B4-14 c1-2', 2: 'ABC 14-10 c6-2', 3: 'ABC 14-10 c6-3', 4: 'ABC 14-03 c12-1', 5: 'ABC 14-03 c12-2', 6: 'ABC 14-01A c9-2', 7: 'ABC 14-01A c9-3', 8: 'DE 368 c1-1', 9: 'DE 368 c1-2', 10: 'DE 483 c3-2', 11: 'DE 483 c3-3', 12: 'FG 1 c4-1', 13: 'HI 1A c4-3'}, 'Count': {0: 1.25, 1: 2.66, 2: 3.5, 3: 6.6, 4: 9.54, 5: 10.57, 6: 11.8, 7: 2.5, 8: 1.1, 9: 1.04, 10: 2.22, 11: 3.22, 12: 1.5, 13: 1.2}}
I want to map the values in the POINT_ID column using the following dictionary:
labels = {'ABC B4-14': ['Local 1', 'Context 1'],
'ABC 14-10': ['Local 2', 'Context 2'],
'ABC 14-03': ['Local 2', 'Context 2'],
'ABC 14-01A': ['Local 1', 'Context 1'],
'DE 368': ['Local 3', 'Context 3'],
'DE 483': ['Local 3', 'Context 4'],
'FG 1': ['Local 4', 'Context 5'],
'HI 1A': ['Local 5', 'Context 6']}
so that I can produce the following data:
{'POINT_ID': {0: 'ABC B4-14 c1-1', 1: 'ABC B4-14 c1-2', 2: 'ABC 14-10 c6-2', 3: 'ABC 14-10 c6-3', 4: 'ABC 14-03 c12-1', 5: 'ABC 14-03 c12-2', 6: 'ABC 14-01A c9-2', 7: 'ABC 14-01A c9-3', 8: 'DE 368 c1-1', 9: 'DE 368 c1-2', 10: 'DE 483 c3-2', 11: 'DE 483 c3-3', 12: 'FG 1 c4-1', 13: 'HI 1A c4-3'}, 'Count': {0: 1.25, 1: 2.66, 2: 3.5, 3: 6.6, 4: 9.54, 5: 10.57, 6: 11.8, 7: 2.5, 8: 1.1, 9: 1.04, 10: 2.22, 11: 3.22, 12: 1.5, 13: 1.2}, 'Local': {0: 'Local 1', 1: 'Local 1', 2: 'Local 2', 3: 'Local 2', 4: 'Local 2', 5: 'Local 2', 6: 'Local 1', 7: 'Local 1', 8: 'Local 3', 9: 'Local 3', 10: 'Local 3', 11: 'Local 3', 12: 'Local 4', 13: 'Local 5'}, 'Context': {0: 'Context 1', 1: 'Context 1', 2: 'Context 2', 3: 'Context 2', 4: 'Context 2', 5: 'Context 2', 6: 'Context 1', 7: 'Context 1', 8: 'Context 3', 9: 'Context 3', 10: 'Context 4', 11: 'Context 4', 12: 'Context 5', 13: 'Context 6'}}
The dictionary keys are contained in the POINT_ID values, but they do not match them exactly. In addition, each dictionary value is a list, and each member of that list needs to go into its own column.
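To make the mismatch concrete, here is a minimal sketch on three rows taken from the data above (the variable names `direct` and `is_prefix` are mine, for illustration only). It shows that a plain `Series.map` finds nothing, even though every key is a prefix of its POINT_ID:

```python
import pandas as pd

# Three rows from the question's data, enough to show the mismatch.
df = pd.DataFrame({
    "POINT_ID": ["ABC B4-14 c1-1", "DE 368 c1-2", "FG 1 c4-1"],
    "Count": [1.25, 1.04, 1.5],
})
labels = {
    "ABC B4-14": ["Local 1", "Context 1"],
    "DE 368": ["Local 3", "Context 3"],
    "FG 1": ["Local 4", "Context 5"],
}

# Exact-match lookup: no POINT_ID equals a key, so every result is missing.
direct = df["POINT_ID"].map(labels)
print(direct.isna().all())  # True

# Each POINT_ID nevertheless starts with one of the keys followed by a space.
is_prefix = [
    any(pid.startswith(key + " ") for key in labels)
    for pid in df["POINT_ID"]
]
print(is_prefix)  # [True, True, True]
```

So the lookup key has to be derived from POINT_ID first, and the two-element lists then have to be spread over two columns.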
Answered 2022-04-04 18:48:40
You can build a DataFrame from the dictionary and merge it on the extracted first part of POINT_ID:
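import pandas as pd
# Lookup table: one row per dictionary key, one column per list element
df2 = pd.DataFrame(labels).set_axis(['Local', 'Context']).T
# Key: everything before the last space-separated token of POINT_ID
ID = df['POINT_ID'].str.extract(r'^(.*?) \S+$', expand=False)
out = df.merge(df2, left_on=ID, right_index=True).drop(columns='key_0')
Another option, using map and join: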
ID = df['POINT_ID'].str.extract(r'^(.*?) \S+$', expand=False)
out = df.join(pd.DataFrame(ID.map(labels).to_list(), index=df.index, columns=['Local', 'Context']))
Output:
           POINT_ID  Count    Local    Context
0    ABC B4-14 c1-1   1.25  Local 1  Context 1
1    ABC B4-14 c1-2   2.66  Local 1  Context 1
2    ABC 14-10 c6-2   3.50  Local 2  Context 2
3    ABC 14-10 c6-3   6.60  Local 2  Context 2
4   ABC 14-03 c12-1   9.54  Local 2  Context 2
5   ABC 14-03 c12-2  10.57  Local 2  Context 2
6   ABC 14-01A c9-2  11.80  Local 1  Context 1
7   ABC 14-01A c9-3   2.50  Local 1  Context 1
8       DE 368 c1-1   1.10  Local 3  Context 3
9       DE 368 c1-2   1.04  Local 3  Context 3
10      DE 483 c3-2   2.22  Local 3  Context 4
11      DE 483 c3-3   3.22  Local 3  Context 4
12        FG 1 c4-1   1.50  Local 4  Context 5
13       HI 1A c4-3   1.20  Local 5  Context 6
Answered 2022-04-04 18:52:28
I think I'd split off everything up to the last space and map that.
df2 = pd.DataFrame(
    # Where the magic happens: drop the last token, then look up the rest
    df.POINT_ID.str.rsplit(n=1).str[0].map(labels).tolist(),
    index=df.index, columns=['Local', 'Context']
)
df.join(df2)
           POINT_ID  Count    Local    Context
0    ABC B4-14 c1-1   1.25  Local 1  Context 1
1    ABC B4-14 c1-2   2.66  Local 1  Context 1
2    ABC 14-10 c6-2   3.50  Local 2  Context 2
3    ABC 14-10 c6-3   6.60  Local 2  Context 2
4   ABC 14-03 c12-1   9.54  Local 2  Context 2
5   ABC 14-03 c12-2  10.57  Local 2  Context 2
6   ABC 14-01A c9-2  11.80  Local 1  Context 1
7   ABC 14-01A c9-3   2.50  Local 1  Context 1
8       DE 368 c1-1   1.10  Local 3  Context 3
9       DE 368 c1-2   1.04  Local 3  Context 3
10      DE 483 c3-2   2.22  Local 3  Context 4
11      DE 483 c3-3   3.22  Local 3  Context 4
12        FG 1 c4-1   1.50  Local 4  Context 5
13       HI 1A c4-3   1.20  Local 5  Context 6
https://stackoverflow.com/questions/71742205
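For anyone who wants to run this end to end, the following is a self-contained sketch on the question's full data. It is a variation on the approaches above, not a copy of either: it uses `reindex` on a lookup table in place of `map(...).tolist()`. The names `key`, `lookup` and `extra` are mine. The reason for the swap is an assumption about real-world data: if some POINT_ID prefix were missing from `labels`, `map` would return NaN for that row and the list handed to `pd.DataFrame` would mix lists with NaN, which I would not rely on. With `reindex`, such rows simply get NaN in both new columns.

```python
import pandas as pd

point_ids = [
    "ABC B4-14 c1-1", "ABC B4-14 c1-2", "ABC 14-10 c6-2", "ABC 14-10 c6-3",
    "ABC 14-03 c12-1", "ABC 14-03 c12-2", "ABC 14-01A c9-2", "ABC 14-01A c9-3",
    "DE 368 c1-1", "DE 368 c1-2", "DE 483 c3-2", "DE 483 c3-3",
    "FG 1 c4-1", "HI 1A c4-3",
]
counts = [1.25, 2.66, 3.5, 6.6, 9.54, 10.57, 11.8, 2.5,
          1.1, 1.04, 2.22, 3.22, 1.5, 1.2]
df = pd.DataFrame({"POINT_ID": point_ids, "Count": counts})

labels = {
    "ABC B4-14": ["Local 1", "Context 1"],
    "ABC 14-10": ["Local 2", "Context 2"],
    "ABC 14-03": ["Local 2", "Context 2"],
    "ABC 14-01A": ["Local 1", "Context 1"],
    "DE 368": ["Local 3", "Context 3"],
    "DE 483": ["Local 3", "Context 4"],
    "FG 1": ["Local 4", "Context 5"],
    "HI 1A": ["Local 5", "Context 6"],
}

# Key = everything before the last whitespace-separated token of POINT_ID.
key = df["POINT_ID"].str.rsplit(n=1).str[0]

# Lookup table indexed by dictionary key, one column per list element.
lookup = pd.DataFrame.from_dict(
    labels, orient="index", columns=["Local", "Context"]
)

# Pick one lookup row per data row (NaN where the key is unknown),
# then relabel the rows with df's index so that join aligns them.
extra = lookup.reindex(key).set_axis(df.index)
out = df.join(extra)
print(out)
```

The printed frame should have the same four columns as the outputs shown in the answers above. `reindex` requires the keys of `labels` to be unique, which is always the case for a dictionary.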