My data looks like this:
import pandas as pd

df = pd.DataFrame(
    {
        "Mall": ["A", "A", "B", "B", "C", "C"],
        "Category": ["Milk", "Egg", "Beef", "Egg", "Orange", "Egg"],
        "Price": [5, 10, 15, 9, 7, 11]
    },
    columns=["Mall", "Category", "Price"],
)
df
>>>
  Mall Category  Price
0    A     Milk      5
1    A      Egg     10
2    B     Beef     15
3    B      Egg      9
4    C   Orange      7
5    C      Egg     11
For each mall and category, I want to compute the price difference relative to that mall's Egg price.
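In this case:
Mall A Milk price - Mall A Egg price
Mall A Egg price - Mall A Egg price
Mall B Beef price - Mall B Egg price
Mall B Egg price - Mall B Egg price
Mall C Orange price - Mall C Egg price
Mall C Egg price - Mall C Egg price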
Expected output:
  Mall Category  Price  Price_diff
0    A     Milk      5          -5
1    A      Egg     10           0
2    B     Beef     15           6
3    B      Egg      9           0
4    C   Orange      7          -4
5    C      Egg     11           0
How can I do this?
Answered on 2020-12-23 08:45:43
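Replace the non-Egg values in Price with missing values using Series.where, then use GroupBy.transform with GroupBy.first to repeat the egg price across each Mall group, and finally subtract the result from Price with Series.sub: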
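The three steps described above can be easier to follow when each intermediate result is kept in its own variable. The following is only a sketch of that breakdown; the names egg_only, egg_per_mall and price_diff are illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Mall": ["A", "A", "B", "B", "C", "C"],
    "Category": ["Milk", "Egg", "Beef", "Egg", "Orange", "Egg"],
    "Price": [5, 10, 15, 9, 7, 11],
})

# Step 1: keep Price only on Egg rows; every other row becomes NaN.
egg_only = df["Price"].where(df["Category"].eq("Egg"))

# Step 2: within each Mall, broadcast the first non-NaN value
# (the egg price) to every row of that group.
egg_per_mall = egg_only.groupby(df["Mall"]).transform("first")

# Step 3: subtract the broadcast egg price from each row's Price.
price_diff = df["Price"].sub(egg_per_mall)

print(egg_only.tolist())      # NaN on non-Egg rows
print(egg_per_mall.tolist())  # egg price repeated per mall
print(price_diff.tolist())
```

Because step 1 introduces NaN, the intermediate Series are floats, which is why the result shows -5.0 rather than -5.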
df['Price_diff'] = (df['Price'].sub(df['Price'].where(df['Category'].eq('Egg'))
                                               .groupby(df['Mall'])
                                               .transform('first')))
print(df)
  Mall Category  Price  Price_diff
0    A     Milk      5        -5.0
1    A      Egg     10         0.0
2    B     Beef     15         6.0
3    B      Egg      9         0.0
4    C   Orange      7        -4.0
5    C      Egg     11         0.0
If Egg is always the second row in each group, and each group contains exactly 2 rows:
df['Price_diff'] = df.groupby('Mall')['Price'].diff(-1).fillna(0)
print (df)
  Mall Category  Price  Price_diff
0    A     Milk      5        -5.0
1    A      Egg     10         0.0
2    B     Beef     15         6.0
3    B      Egg      9         0.0
4    C   Orange      7        -4.0
5    C      Egg     11         0.0
Answered on 2020-12-23 08:49:54
Use groupby together with transform:
# Assumes each Mall has exactly two rows, with Egg as the second one
df['Price_diff'] = df.groupby('Mall')['Price'].transform(lambda x: [x.tolist()[0] - x.tolist()[1], 0])
Output:
  Mall Category  Price  Price_diff
0    A     Milk      5          -5
1    A      Egg     10           0
2    B     Beef     15           6
3    B      Egg      9           0
4    C   Orange      7          -4
5    C      Egg     11           0
https://stackoverflow.com/questions/65421343
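As a further sketch that is not taken from either answer above, the egg price can also be looked up by label with Series.map. This avoids relying on row order within each mall. It assumes every mall has exactly one Egg row; the name egg_price is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Mall": ["A", "A", "B", "B", "C", "C"],
    "Category": ["Milk", "Egg", "Beef", "Egg", "Orange", "Egg"],
    "Price": [5, 10, 15, 9, 7, 11],
})

# Assumption: exactly one Egg row per mall, so Mall is a unique index here.
egg_price = df.loc[df["Category"].eq("Egg")].set_index("Mall")["Price"]

# Look up each row's mall in egg_price and subtract it from that row's Price.
df["Price_diff"] = df["Price"] - df["Mall"].map(egg_price)
print(df)
```

If a mall had no Egg row, the lookup would yield NaN for that mall's rows. If a mall had several Egg rows, the index would no longer be unique and the lookup would fail, so the data would need to be deduplicated or aggregated first.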