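I'm a Python newbie working with version 2.7. Below is a sample of the DataFrame I'm using. There are some extra columns that are not relevant to the question, so they are not included below.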
import pandas as pd

df = pd.DataFrame({
    "Name": ["BROD", "BROD", "BROD", "BROD", "SSBD", "SSBD", "SSBD", "SSBD"],
    "Digit": ["F", "F", "T", "T", "F", "F", "T", "T"],
    "ID": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Date": ["2/3/2010", "2/3/2010", "2/3/2010", "2/3/2010", "3/4/2007", "3/4/2007", "3/4/2007", "3/4/2007"],
    "Base": ["CAD", "CAD", "CAD", "CAD", "CAD", "CAD", "CAD", "CAD"],
    "Term": ["USD", "USD", "JPY", "JPY", "EUR", "EUR", "JPY", "JPY"],
    "Amt": [100.00, 100.00, 9082.00, 9082.00, 60.00, 60.00, 7387.80, 7387.80]})

There are multiple duplicated values. Each row represents one component of a transaction, and the ID column groups the rows into a single transaction. I want to create a new DataFrame with only one row per transaction. It should look like this:
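ID  Date      Name  Buy  Sell  Buy Amt  Sell Amt
A   2/3/2010  BROD  USD  JPY   100.00   9082.00
B   3/4/2007  SSBD  EUR  JPY   60.00    7387.80

For each ID, if Digit = F, the value in the Term column goes into the Buy column and the value in the Amt column goes into the Buy Amt column. If Digit = T, the value in the Term column goes into the Sell column and the value in the Amt column goes into the Sell Amt column.
Please point me in the right direction to solve this in the most efficient way. Thanks.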
Posted on 2018-09-12 20:05:25
You can use np.where to build the new columns, then groupby to collapse each ID to one row:
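import numpy as np

# Fill Buy/BuyAmt only on 'F' rows and Sell/SellAmt only on 'T' rows; NaN elsewhere
df['Buy'] = np.where(df['Digit'] == 'F', df['Term'], np.nan)
df['Sell'] = np.where(df['Digit'] == 'T', df['Term'], np.nan)
df['BuyAmt'] = np.where(df['Digit'] == 'F', df['Amt'], np.nan)
df['SellAmt'] = np.where(df['Digit'] == 'T', df['Amt'], np.nan)
# Drop the source columns that are no longer needed
df.drop(['Digit', 'Base', 'Term', 'Amt'], axis=1, inplace=True)
# first() keeps the first non-null value of each column within each ID
df = df.groupby('ID').first()
print(df)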
    Name      Date  Buy Sell  BuyAmt  SellAmt
ID
A   BROD  2/3/2010  USD  JPY   100.0   9082.0
B   SSBD  3/4/2007  EUR  JPY    60.0   7387.8

Also, if you need the columns in the order you posted, you can use pandas reindex.
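The reordering mentioned above could look roughly like the sketch below. The `result` frame is a hand-built stand-in for the grouped output (an assumption, so the snippet runs on its own), and `wanted` is a hypothetical name for the target column order; the column names follow this answer's `BuyAmt`/`SellAmt` spelling rather than the question's.

```python
import pandas as pd

# Stand-in for the grouped result shown above (ID is the index).
result = pd.DataFrame(
    {
        "Name": ["BROD", "SSBD"],
        "Date": ["2/3/2010", "3/4/2007"],
        "Buy": ["USD", "EUR"],
        "Sell": ["JPY", "JPY"],
        "BuyAmt": [100.0, 60.0],
        "SellAmt": [9082.0, 7387.8],
    },
    index=pd.Index(["A", "B"], name="ID"),
)

# Column order from the question; ID stays as the index here.
wanted = ["Date", "Name", "Buy", "Sell", "BuyAmt", "SellAmt"]
reordered = result.reindex(columns=wanted)
print(reordered)
```

Note that `reindex` fills any label it cannot find with NaN instead of raising, so a misspelled column name would produce an empty column rather than an error.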
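Posted on 2018-09-12 21:49:37
I'm assuming the duplicates should be removed; otherwise you need to explain more clearly how identical rows should be handled: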
>>> df2 = df.drop_duplicates().reset_index(drop=True)

Then we create two DataFrames, one for 'F' and one for 'T', add the Buy/Sell and Buy Amt/Sell Amt columns to each, and drop the unused columns:
>>> df_F = (df2[df2.Digit == 'F']
...         .assign(**{'Buy': lambda x: x.Term, 'Buy Amt': lambda x: x.Amt})
...         .drop(['Digit', 'Base', 'Term', 'Amt'], axis=1))
>>> df_T = (df2[df2.Digit == 'T']
...         .assign(**{'Sell': lambda x: x.Term, 'Sell Amt': lambda x: x.Amt})
...         .drop(['Digit', 'Base', 'Term', 'Amt'], axis=1))

Finally, we merge the two DataFrames and rearrange the column order:
>>> merged = df_F.merge(df_T, on=['ID', 'Name', 'Date'])
>>> merged[['ID', 'Date', 'Name', 'Buy', 'Sell', 'Buy Amt', 'Sell Amt']]
  ID      Date  Name  Buy Sell  Buy Amt  Sell Amt
0  A  2/3/2010  BROD  USD  JPY    100.0    9082.0
1  B  3/4/2007  SSBD  EUR  JPY     60.0    7387.8

That's it. If 'ID' should be the index, you can use merged.set_index('ID').
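One thing to watch with the merge step: `DataFrame.merge` defaults to an inner join, so a transaction that has only an 'F' row or only a 'T' row would disappear from the result. The sketch below illustrates this with made-up data (transaction "C" and the name "XMPL" are hypothetical, not from the question) and shows `how="outer"` as one possible way to keep such rows, followed by the `set_index('ID')` step.

```python
import pandas as pd

# Hypothetical data: transaction "C" has a buy leg but no sell leg.
df_F = pd.DataFrame({
    "ID": ["A", "C"],
    "Name": ["BROD", "XMPL"],
    "Date": ["2/3/2010", "1/1/2011"],
    "Buy": ["USD", "GBP"],
    "Buy Amt": [100.0, 50.0],
})
df_T = pd.DataFrame({
    "ID": ["A"],
    "Name": ["BROD"],
    "Date": ["2/3/2010"],
    "Sell": ["JPY"],
    "Sell Amt": [9082.0],
})

keys = ["ID", "Name", "Date"]
inner = df_F.merge(df_T, on=keys)               # default how="inner": "C" is dropped
outer = df_F.merge(df_T, on=keys, how="outer")  # "C" is kept, Sell columns are NaN

# Make ID the index, as suggested above.
result = outer.set_index("ID")
print(result)
```

Whether the unmatched transaction should be kept or dropped depends on the data, which the question does not specify.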
https://stackoverflow.com/questions/52302465