Sorry if this question is confusing; I'm not sure how to phrase it. If it's a duplicate, please let me know.
I have a groupby result that looks like this:
us.groupby(['category_id', 'title']).sum()[['views']]
us
category_id      title                                    views
Autos & Vehicle  1980 toyota corolla liftback commercial  13061
                 1992 Chevy Lumina Euro commercial        18470406
                 2019 Chevrolet Silverado First Look      13061
Music            Backyard Boys                            133
                 Eminem - Song                            1223
                 Cardi B - Wap                            1111122
Travel & Events  Welcome to Winter PUNderland             437576
                 What Spring Looks Like Around The World  17554672
I only want the row with the highest views in each category, for example:
category_id      title                                    views
Autos & Vehicle  1992 Chevy Lumina Euro commercial        18470406
Music            Cardi B - Wap                            1111122
Travel & Events  What Spring Looks Like Around The World  17554672
How can I do this?
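I tried the .first() method, and also something like us.groupby(['category_id', 'title']).sum()[['views']].sort_values(by='views', ascending=False)[:1], but that only returns the first row of the whole DataFrame. Is there a function that keeps only the maximum row of each group in a groupby result?
Thanks!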
Posted on 2020-09-23 10:48:11
You can try:
us_group = us.groupby(['category_id', 'title']).sum()[['views']]
# Sort ascending by views, then keep the last (largest) row for each category_id
(us_group.reset_index().sort_values(['views'])
    .drop_duplicates('category_id', keep='last')
)
https://stackoverflow.com/questions/64020290
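As a possible alternative to sorting and dropping duplicates, the same result can be sketched with idxmax. The slice [:1] in the question's attempt is applied after sorting the whole frame, which is why it returned a single row rather than one row per category. The sample DataFrame below is a hypothetical reconstruction of the table in the question, and the names flat and top_per_category are illustrative only:

```python
import pandas as pd

# Hypothetical sample data mirroring the table shown in the question.
us = pd.DataFrame({
    "category_id": ["Autos & Vehicle"] * 3 + ["Music"] * 3 + ["Travel & Events"] * 2,
    "title": [
        "1980 toyota corolla liftback commercial",
        "1992 Chevy Lumina Euro commercial",
        "2019 Chevrolet Silverado First Look",
        "Backyard Boys",
        "Eminem - Song",
        "Cardi B - Wap",
        "Welcome to Winter PUNderland",
        "What Spring Looks Like Around The World",
    ],
    "views": [13061, 18470406, 13061, 133, 1223, 1111122, 437576, 17554672],
})

# Total views per (category_id, title), flattened back to ordinary columns.
flat = us.groupby(["category_id", "title"])[["views"]].sum().reset_index()

# For each category, idxmax gives the row label of the highest view count;
# .loc then selects exactly those rows.
top_per_category = flat.loc[flat.groupby("category_id")["views"].idxmax()]

print(top_per_category)
```

If two titles in a category tie for the maximum, idxmax picks the first occurrence, so only one of them is kept.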