I have the following df:
In [260]: df
Out[260]:
    size market vegetable  confirm availability
0  Large    ABC    Tomato                   NaN
1  Large    XYZ    Tomato                   NaN
2  Small    ABC    Tomato                   NaN
3  Large    ABC     Onion                   NaN
4  Small    ABC     Onion                   NaN
5  Small    XYZ     Onion                   NaN
6  Small    XYZ     Onion                   NaN
7  Small    XYZ   Cabbage                   NaN
8  Large    XYZ   Cabbage                   NaN
9  Small    ABC   Cabbage                   NaN

1) How do I get, for each vegetable, the size with the largest count?
I used groupby on vegetable and size to get the df below, but I need only the rows that hold the largest size count for each vegetable.
In [262]: df.groupby(['vegetable','size']).count()
Out[262]:
                 market  confirm availability
vegetable size
Cabbage   Large       1                     0
          Small       2                     0
Onion     Large       1                     0
          Small       3                     0
Tomato    Large       2                     0
          Small       1                     0
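df2['vegetable','size'] = df.groupby(['vegetable','size']).count().apply( some logic )

Desired df:
  vegetable   size  max_count
0   Cabbage  Small          2
1     Onion  Small          3
2    Tomato  Large          2

2) Now I can say that "Small Cabbage" is in plentiful supply in the df, so I need to fill the confirm availability column with Small for every Cabbage row. How can I do that?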
    size market vegetable  confirm availability
0  Large    ABC    Tomato                 Large
1  Large    XYZ    Tomato                 Large
2  Small    ABC    Tomato                 Large
3  Large    ABC     Onion                 Small
4  Small    ABC     Onion                 Small
5  Small    XYZ     Onion                 Small
6  Small    XYZ     Onion                 Small
7  Small    XYZ   Cabbage                 Small
8  Large    XYZ   Cabbage                 Small
9  Small    ABC   Cabbage                 Small

Posted on 2018-09-09 18:48:32
1)
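# veg_df is the question's df.
# Count rows per (vegetable, size), sort so each vegetable's largest count comes last, then keep that last row.
required_df = veg_df.groupby(['vegetable','size'], as_index=False)['market'].count()\
                    .sort_values(by=['vegetable', 'market'])\
                    .drop_duplicates(subset='vegetable', keep='last')

2)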
# Attach each vegetable's most frequent size (size_y after the merge) to every row.
merged_df = veg_df.merge(required_df, on='vegetable')
cols = ['size_x', 'market_x', 'vegetable', 'size_y']
dict_renaming_cols = {'size_x': 'size',
                      'market_x': 'market',
                      'size_y': 'confirm_availability'}
merged_df = merged_df.loc[:,cols].rename(columns=dict_renaming_cols)

Posted on 2018-09-09 19:02:29
You can use GroupBy with count, then sort and drop duplicates:
res = df.groupby(['size', 'vegetable'], as_index=False)['market'].count()\
        .sort_values('market', ascending=False)\
        .drop_duplicates('vegetable')
print(res)
    size vegetable  market
4  Small     Onion       3
2  Large    Tomato       2
3  Small   Cabbage       2

Posted on 2018-09-09 18:29:29
You can assign the grouped DataFrame to another object, then group again on its index to get the maximum you need:
d = df.groupby(['vegetable','size']).count()
d.groupby(d.index.get_level_values(0).tolist()).apply(lambda x:x[x.confirm == x.confirm.max()])

Output:
                         market  confirm  availability
        vegetable size
Cabbage Cabbage   Small       2        2             0
Onion   Onion     Small       3        3             0
Tomato  Tomato    Large       2        2             0

https://stackoverflow.com/questions/52243060
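For anyone who wants to try both parts of the question end to end, here is a minimal, self-contained sketch. It is not taken from any of the answers above: it swaps their sort-and-drop-duplicates and merge steps for `idxmax` and `Series.map`. The column name `confirm_availability` is an assumption (the question's header shows "confirm availability"), and `counts` and `top` are illustrative names.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame; the name of the last column is an assumption.
df = pd.DataFrame({
    "size": ["Large", "Large", "Small", "Large", "Small",
             "Small", "Small", "Small", "Large", "Small"],
    "market": ["ABC", "XYZ", "ABC", "ABC", "ABC",
               "XYZ", "XYZ", "XYZ", "XYZ", "ABC"],
    "vegetable": ["Tomato", "Tomato", "Tomato", "Onion", "Onion",
                  "Onion", "Onion", "Cabbage", "Cabbage", "Cabbage"],
    "confirm_availability": np.nan,
})

# Part 1: count rows per (vegetable, size) pair.
counts = (
    df.groupby(["vegetable", "size"])
      .size()
      .reset_index(name="max_count")
)

# For each vegetable keep the row with the largest count.
# idxmax returns the first maximum, so a tie goes to the size that sorts first.
top = counts.loc[counts.groupby("vegetable")["max_count"].idxmax()]
top = top.reset_index(drop=True)

# Part 2: look up every row's vegetable in a vegetable -> size mapping.
size_by_vegetable = top.set_index("vegetable")["size"]
df["confirm_availability"] = df["vegetable"].map(size_by_vegetable)

print(top)
print(df)
```

With the sample data, `top` should hold one row per vegetable (Cabbage/Small, Onion/Small, Tomato/Large), and the mapped column should be filled for all ten rows. If ties between sizes matter for your data, the tie-breaking rule noted in the comment is worth checking before relying on this.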