I want to group by df['A'] and, for each group, extract the values in df['B'] that correspond to the two smallest values in df['C'].

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['cat', 'dog', 'rat', 'lion',
                         'bat', 'racoon', 'possum', 'deer'],
                   'C': [1, 2, 6, 4, 3, 1, 2, 4]})

I would like the result to be:

     A     B_1     B_2
0  foo     cat  possum
1  bar  racoon     dog

Posted on 2017-02-18 06:14:04
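I think you need:

- groupby with nsmallest
- unstack for reshaping
- add 1 to the column names
- add_prefix
- finally reset_index

# Index by B so that nsmallest(2).index returns the B labels of the two smallest C values
df1 = (df.set_index('B')
         .groupby('A', sort=False)['C']
         .apply(lambda x: pd.Series(x.nsmallest(2).index))
         .unstack())
# Columns are 0 and 1 after unstack; shift them to 1 and 2
df1.columns = df1.columns + 1
df1 = df1.add_prefix('B_').reset_index()
print(df1)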
     A     B_1     B_2
0  foo     cat  possum
1  bar  racoon     dog

One-line solution:
# Naming the inner Series index B_1, B_2 yields the final column names directly
df1 = (df.set_index('B')
         .groupby('A', sort=False)['C']
         .apply(lambda x: pd.Series(x.nsmallest(2).index, index=['B_1', 'B_2']))
         .unstack()
         .reset_index())
print(df1)
     A     B_1     B_2
0  foo     cat  possum
1  bar  racoon     dog

Edit:
It also works well when C holds datetime values:

import numpy as np

np.random.seed(100)
df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['cat', 'dog', 'rat', 'lion',
                         'bat', 'racoon', 'possum', 'deer'],
                   'C': np.random.choice(pd.date_range('2017-02-18',
                                                       periods=8),
                                         size=8, replace=False)})
print(df)
     A       B          C
0  foo     cat 2017-02-19
1  bar     dog 2017-02-22
2  foo     rat 2017-02-23
3  bar    lion 2017-02-20
4  foo     bat 2017-02-24
5  bar  racoon 2017-02-21
6  foo  possum 2017-02-25
7  foo    deer 2017-02-18

print(df.dtypes)
A            object
B            object
C    datetime64[ns]

df1 = (df.set_index('B')
         .groupby('A', sort=False)['C']
         .apply(lambda x: pd.Series(x.nsmallest(2).index, index=['B_1', 'B_2']))
         .unstack()
         .reset_index())
print(df1)
     A   B_1     B_2
0  foo  deer     cat
1  bar  lion  racoon

https://stackoverflow.com/questions/42311619
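
For comparison, the same result can probably be reached without apply by sorting on C, keeping the first two rows per group, and pivoting. This is only a sketch of an alternative, not the method used in the answer above; the names top2, slot and result are illustrative and do not appear in the original.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['cat', 'dog', 'rat', 'lion',
                         'bat', 'racoon', 'possum', 'deer'],
                   'C': [1, 2, 6, 4, 3, 1, 2, 4]})

# Sort by C (stable, so ties keep their original order), then keep
# the first two rows of each group: these are the two smallest C values.
top2 = df.sort_values('C', kind='stable').groupby('A', sort=False).head(2).copy()

# Number the rows inside each group (1, 2) and turn that into column labels.
top2['slot'] = 'B_' + (top2.groupby('A').cumcount() + 1).astype(str)

# Pivot to one row per group, restore the original group order,
# and clear the leftover axis name on the columns.
result = (top2.pivot(index='A', columns='slot', values='B')
              .reindex(df['A'].unique())
              .rename_axis(index='A', columns=None)
              .reset_index())
print(result)
```

Unlike the one-line solution with index=['B_1', 'B_2'], this variant does not assume every group has at least two rows; a group with a single row would simply get a missing value in B_2.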