I have a DataFrame that looks like this:
| PACKAGES SHIPPED | PACKAGES TRANSFERRED |
Product & Quantity | Apple-5 pk | Apple-5 pk | Apple-5 pk | Apple-5 pk |
Store Branch I.D. | 34234324 | 34235555 | 34234324 | 34235555 |
----------------------------------------------------------------------------
Period Week
5/14 - 5/20 | 5 | 10 | 20 | 7 |
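5/21 - 5/27 | 40 | X | 1 | Y |
This DataFrame has multi-level column headers: "Packages Shipped" sits at the top level, and beneath it there are many store branches, each with its own "Packages Shipped" value.
If I want to sum "Packages Shipped" and "Packages Transferred" for a specific "Product & Quantity" value and a specific "Store Branch I.D." in each period week, what is the most efficient way to do it?
Ideally, the resulting DataFrame would be: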
|Sum Shipped & Transferred|Sum Shipped & Transferred |
Product & Quantity | Apple-5 pk | Apple-10 pk |
Store Branch I.D. | 34234324 | 34235555 | 34234324 | 34235555 |
----------------------------------------------------------------------------
Period Week
5/14 - 5/20 | 25 | 17 | 40 | 234 |
5/21 - 5/27 | 41 | X+Y | 34 | 25 |
Posted on 2019-06-25 04:02:49
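It may help to represent this as an actual DataFrame rather than as a picture. Here is a simple way to think about your problem: store the data in long form, one row per period, transaction type, and SKU. Of course, if you really are storing the data with the multi-level column index shown in the picture, this will not help.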
In [33]: import pandas as pd
    ...: df = pd.DataFrame({'Period Week': ['5/14 - 5/20', '5/21 - 5/27', '5/14 - 5/20', '5/21 - 5/27'],
    ...:                    'Transaction': ['Shipped', 'Shipped', 'Transferred', 'Transferred'],
    ...:                    'Package SKU': ['Apples-5k', 'Apples-10k', 'Apples-5k', 'Apples-10k'],
    ...:                    'Quantity': [5, 10, 20, 7]})
In [34]: df
Out[34]:
   Period Week  Transaction Package SKU  Quantity
0  5/14 - 5/20      Shipped   Apples-5k         5
1  5/21 - 5/27      Shipped  Apples-10k        10
2  5/14 - 5/20  Transferred   Apples-5k        20
3  5/21 - 5/27  Transferred  Apples-10k         7
Then set the index to multiple columns:
df.set_index(['Period Week','Transaction','Package SKU'])  # returns a new DataFrame; df itself stays unchanged unless you reassign it
Finally, group by and calculate:
In [35]: df.groupby(['Period Week','Package SKU'])['Quantity'].sum()  # total quantity per period and SKU, shipped plus transferred
Out[35]:
Period Week  Package SKU
5/14 - 5/20  Apples-5k      25
5/21 - 5/27  Apples-10k     17
Name: Quantity, dtype: int64
https://stackoverflow.com/questions/56742532
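If the data really is stored with the multi-level column header shown in the question, one possible approach is to select each top-level block and add the two, which aligns on the remaining product and store labels. The following is only a sketch under assumptions: the level names, the variable names `wide`, `totals`, and `grouped`, and the numbers 3 and 4 standing in for the question's X and Y are all made up for illustration.

```python
import pandas as pd

# Three column levels mirroring the header rows in the question.
# The level names are assumptions chosen for this sketch.
columns = pd.MultiIndex.from_tuples(
    [
        ("PACKAGES SHIPPED", "Apple-5 pk", 34234324),
        ("PACKAGES SHIPPED", "Apple-5 pk", 34235555),
        ("PACKAGES TRANSFERRED", "Apple-5 pk", 34234324),
        ("PACKAGES TRANSFERRED", "Apple-5 pk", 34235555),
    ],
    names=["Transaction", "Product & Quantity", "Store Branch I.D."],
)

# 3 and 4 are placeholders for the question's unknown X and Y.
wide = pd.DataFrame(
    [[5, 10, 20, 7], [40, 3, 1, 4]],
    index=pd.Index(["5/14 - 5/20", "5/21 - 5/27"], name="Period Week"),
    columns=columns,
)

# Selecting a top-level key drops that level, leaving (product, store) columns.
# add() aligns the two blocks on those labels; fill_value=0 keeps a
# product/store pair that appears in only one of the blocks.
totals = wide["PACKAGES SHIPPED"].add(wide["PACKAGES TRANSFERRED"], fill_value=0)

# Alternative that also covers more than two top-level blocks:
# group the transposed frame by the two lower levels and sum.
grouped = (
    wide.T.groupby(level=["Product & Quantity", "Store Branch I.D."]).sum().T
)

print(totals)
```

With the placeholder numbers, the first row comes out as 25 and 17, matching the first row of the desired table in the question. Both variants keep "Period Week" as the row index, so the result has the layout the question asks for.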