I have data like this:
import pandas as pd

df = pd.DataFrame(
    {
        "day": ["Mon", "Mon", "Mon", "Mon", "Tues", "Tues", "Tues", "Tues"],
        "name": ["James", "Rover", "Cleo", "X", "Bran", "Excaliber", "Henry", "Mia"],
        "species": ['dog', 'dog', 'cat', 'cat', 'dog', 'dog', 'cat', 'cat'],
        "fleas": [0, 1, 2, 4, 6, 7, 8, 3],
    }
)
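>>> df
    day       name  species  fleas
0   Mon      James      dog      0
1   Mon      Rover      dog      1
2   Mon       Cleo      cat      2
3   Mon          X      cat      4
4  Tues       Bran      dog      6
5  Tues  Excaliber      dog      7
6  Tues      Henry      cat      8
7  Tues        Mia      cat      3

Each row is a measurement of how many fleas a particular animal had on a particular day. What I want to do is group by day, sum the fleas on all the dogs in each group, and store the result in a new column, dog_fleas. The result of this operation should look like this: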
    day       name  species  fleas  dog_fleas
0   Mon      James      dog      0          1
1   Mon      Rover      dog      1          1
2   Mon       Cleo      cat      2          1
3   Mon          X      cat      4          1
4  Tues       Bran      dog      6         13
5  Tues  Excaliber      dog      7         13
6  Tues      Henry      cat      8         13
7  Tues        Mia      cat      3         13

How can I do this in pandas?
Posted on 2022-09-02 02:36:42
Let's try transform and where:
df['new'] = df['fleas'].where(df['species']=='dog').groupby(df['day']).transform('sum')
df
Out[88]:
    day       name  species  fleas   new
0   Mon      James      dog      0   1.0
1   Mon      Rover      dog      1   1.0
2   Mon       Cleo      cat      2   1.0
3   Mon          X      cat      4   1.0
4  Tues       Bran      dog      6  13.0
5  Tues  Excaliber      dog      7  13.0
6  Tues      Henry      cat      8  13.0
7  Tues        Mia      cat      3  13.0

Posted on 2022-09-02 02:34:50
Group by day and compute the total number of fleas on dogs for that day:
def f(group):
    # Sum the fleas on dogs within this day's group and
    # repeat that total for every row of the group.
    return pd.Series(group.loc[group['species'].eq('dog'), 'fleas'].sum(),
                     index=group.index)

df['dog_fleas'] = df.groupby('day', group_keys=False).apply(f)

    day       name  species  fleas  dog_fleas
0   Mon      James      dog      0          1
1   Mon      Rover      dog      1          1
2   Mon       Cleo      cat      2          1
3   Mon          X      cat      4          1
4  Tues       Bran      dog      6         13
5  Tues  Excaliber      dog      7         13
6  Tues      Henry      cat      8         13
7  Tues        Mia      cat      3         13

Posted on 2022-09-02 02:30:16
Group by the day column, apply a function that sums the fleas on dogs, convert the result to a DataFrame, then merge it back on the day column.
>>> df.merge(
...     df
...     .groupby('day')
...     .apply(lambda x: x.loc[x['species'].eq('dog'), 'fleas'].sum())
...     .to_frame('dog_fleas'),
...     on='day')
    day       name  species  fleas  dog_fleas
0   Mon      James      dog      0          1
1   Mon      Rover      dog      1          1
2   Mon       Cleo      cat      2          1
3   Mon          X      cat      4          1
4  Tues       Bran      dog      6         13
5  Tues  Excaliber      dog      7         13
6  Tues      Henry      cat      8         13
7  Tues        Mia      cat      3         13

https://stackoverflow.com/questions/73577198
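
As a side-by-side check of the ideas in the answers above, here is a minimal sketch that rebuilds the question's df and computes dog_fleas two ways. Passing 0 as the second argument to where is an addition that is not in the answers; it is assumed here only to avoid the NaN values that turn the sums into floats such as 1.0 and 13.0. The names dog_only, per_day, and dog_fleas_via_map are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "day": ["Mon", "Mon", "Mon", "Mon", "Tues", "Tues", "Tues", "Tues"],
        "name": ["James", "Rover", "Cleo", "X", "Bran", "Excaliber", "Henry", "Mia"],
        "species": ["dog", "dog", "cat", "cat", "dog", "dog", "cat", "cat"],
        "fleas": [0, 1, 2, 4, 6, 7, 8, 3],
    }
)

# Variant 1: replace non-dog rows with 0 (instead of NaN) so the
# per-day sum keeps the integer dtype, then broadcast it with transform.
dog_only = df["fleas"].where(df["species"].eq("dog"), 0)
df["dog_fleas"] = dog_only.groupby(df["day"]).transform("sum")

# Variant 2: aggregate the dog rows once per day, then look up each
# row's day in that small result. Days without any dog fall back to 0.
per_day = df.loc[df["species"].eq("dog")].groupby("day")["fleas"].sum()
df["dog_fleas_via_map"] = df["day"].map(per_day).fillna(0).astype(int)

print(df)
```

Both new columns should hold 1 for every Mon row and 13 for every Tues row, stored as integers. Which variant reads better is largely a matter of taste; the map version makes the intermediate per-day totals available for inspection.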