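I have 3 DataFrames that I would like to merge. The first 3 columns (ID, A, B) hold the same data where a row exists, and each df then contributes its own new columns.
If rows share the same unique identifier I want to merge them; otherwise I want to append them. For example: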
df1
ID A B 2009 2010
1 A B 2 3
2 A C 2 2
3 A B 3 3
df2
ID A B 2011 2012
2 A C 2 2
3 A C 3 4
5 A B 8 9
df3
ID A B 2013 2014
2 A C 2 3
4 A E 3 4
5 A B 8 9
Result df
ID A B 2009 2010 2011 2012 2013 2014
1 A B 2 3. 2. 3.
2 A C 2 2. 2. 2. 2. 3
3 A C 3 3. 3. 4.
4 A E 3. 4
5 A B 8 9 8. 9
Edit: fixed the df data. Also, one problem I noticed is that when I merge, my A and B columns get duplicated as A_X, A_Y, A_Z, B_X, B_Y, B_Z. Thanks in advance.
Answered 2020-12-05 02:31:07
Something is off with your expected result.
But the code for the merge would look like this:
from functools import reduce
import pandas as pd

dfs = [df1, df2, df3]
# Outer-merge the frames pairwise on ID, keeping rows that appear in any of them
df_merged = reduce(lambda left, right: pd.merge(left, right, on=['ID'], how='outer'), dfs)

df_merged:
ID 2009 2010 2011 2012 2013 2014
0 1 2.0 3.0 2.0 3.0 NaN NaN
1 2 3.0 4.0 3.0 4.0 2.0 3.0
2 3 4.0 5.0 4.0 5.0 NaN NaN
3 4 NaN NaN NaN NaN 3.0 4.0
4 5 NaN NaN NaN NaN 8.0 9.0
Edit:
Just use on=['ID', 'A', 'B'] instead, so that A and B are treated as join keys and are not duplicated.
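For reference, here is a minimal self-contained sketch of that call, with the three frames rebuilt from the question's tables. The year columns are assumed to be string labels, and `merge_all` is a hypothetical helper name used only for this illustration. It also merges on ID alone, to show where the suffixed A and B columns come from.

```python
from functools import reduce

import pandas as pd

# Frames rebuilt from the question's tables (year labels assumed to be strings).
df1 = pd.DataFrame({"ID": [1, 2, 3], "A": ["A", "A", "A"], "B": ["B", "C", "B"],
                    "2009": [2, 2, 3], "2010": [3, 2, 3]})
df2 = pd.DataFrame({"ID": [2, 3, 5], "A": ["A", "A", "A"], "B": ["C", "C", "B"],
                    "2011": [2, 3, 8], "2012": [2, 4, 9]})
df3 = pd.DataFrame({"ID": [2, 4, 5], "A": ["A", "A", "A"], "B": ["C", "E", "B"],
                    "2013": [2, 3, 8], "2014": [3, 4, 9]})
dfs = [df1, df2, df3]


def merge_all(frames, keys):
    """Outer-merge all frames pairwise on `keys` (hypothetical helper)."""
    return reduce(lambda left, right: pd.merge(left, right, on=keys, how="outer"), frames)


# Keyed on ID only: A and B are ordinary columns, so pandas suffixes the clashes.
by_id = merge_all(dfs, ["ID"])
print(by_id.columns.tolist())

# Keyed on all three shared columns: A and B appear once each.
by_all = merge_all(dfs, ["ID", "A", "B"])
print(by_all)
```

Row order of an outer merge can differ between pandas versions, so the rows may not come out in exactly the order of the listing below.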
Output:
ID A B 2009 2010 2011 2012 2013 2014
0 1 A B 2.0 3.0 NaN NaN NaN NaN
1 2 A C 2.0 2.0 2.0 2.0 2.0 3.0
2 3 A B 3.0 3.0 NaN NaN NaN NaN
3 3 A C NaN NaN 3.0 4.0 NaN NaN
4 5 A B NaN NaN 8.0 9.0 8.0 9.0
5 4 A E NaN NaN NaN NaN 3.0 4.0
Answered 2020-12-05 02:31:33
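Try pd.concat([df.set_index('ID') for df in [df1, df2, df3]], axis=1).reset_index()
The list comprehension sets ID as the index of each DataFrame. We then concatenate horizontally. Horizontal concatenation aligns rows on the index where it can and adds new rows otherwise. Finally, we reset the index.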
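A runnable sketch of this approach, using frames rebuilt from the question's tables (year labels assumed to be strings). With only ID as the index, A and B are carried over once per frame. The second call, which indexes on all three shared columns, is a variation assumed here and is not part of the answer as written.

```python
import pandas as pd

# Frames rebuilt from the question's tables (year labels assumed to be strings).
df1 = pd.DataFrame({"ID": [1, 2, 3], "A": ["A", "A", "A"], "B": ["B", "C", "B"],
                    "2009": [2, 2, 3], "2010": [3, 2, 3]})
df2 = pd.DataFrame({"ID": [2, 3, 5], "A": ["A", "A", "A"], "B": ["C", "C", "B"],
                    "2011": [2, 3, 8], "2012": [2, 4, 9]})
df3 = pd.DataFrame({"ID": [2, 4, 5], "A": ["A", "A", "A"], "B": ["C", "E", "B"],
                    "2013": [2, 3, 8], "2014": [3, 4, 9]})
dfs = [df1, df2, df3]

# As in the answer: index on ID, concatenate side by side, restore ID as a column.
# A and B are not part of the index here, so each frame contributes its own copy.
wide_by_id = pd.concat([df.set_index("ID") for df in dfs], axis=1).reset_index()
print(wide_by_id.columns.tolist())

# Variation (an assumption, not from the answer): index on every shared column,
# so that A and B are aligned like ID instead of being repeated.
keys = ["ID", "A", "B"]
wide = pd.concat([df.set_index(keys) for df in dfs], axis=1).reset_index()
print(wide)
```

Note that ID 3 has B = B in df1 but B = C in df2. Indexing on ID alone puts both on one row, while indexing on all three columns keeps them as two separate rows.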
https://stackoverflow.com/questions/65148665