我的df:
Test_Data = [('Client', ['A', 'A', 'A', 'B', 'B', 'B','C','C']),
('Currency', ['USD', 'EUR', 'USD', 'AUD', 'EUR', 'USD', 'GBP', 'USD']),
('SalesPerson', ['Dave', 'Dave', 'Bob', 'Dave', 'Dave', 'Bob','Dave','Bob']),
('Done_Trades', [1,1,2,3,3,4,5,6]),
('Average_Qty', [10, 50, 100, 10, 50, 1000, 50, 100]),
('Average_Qty_CAD', [1, 2, 3, 30,20, 10,1,2])
df = pd.DataFrame(dict(Test_Data))
print(df)
Client Currency SalesPerson Done_Trades Average_Qty Average_Qty_CAD
0 A USD Dave 1 10 1
1 A EUR Dave 1 50 2
2 A USD Bob 2 100 3
3 B AUD Dave 3 10 30
4 B EUR Dave 3 50 20
5 B USD Bob 4 1000 10
6 C GBP Dave 5 50 1
7 C USD Bob 6 100 2a.客户端B的每个客户端(60)的求和Average_Qty_CAD最高,因此首先显示A (6),然后是C (3)。
在b中,Dave有最高的Average_Qty_CAD (30),然后是第二个(20),Bob有第三个(10),所以我们希望B的行排序为30,20,10。
在A中,Bob有最高的Average_Qty_CAD (3),然后是Dave的两个条目(2,1),所以我们需要A订购3,2,1。
在C中,鲍勃的Average_Qty_CAD最高(1002 ),其次是戴夫(1),所以是2,1。
另外,如果我希望每个客户端的和Average_Qty_CAD显示,需要添加什么?
想要的df:
Client Currency SalesPerson Done_Trades Average_Qty Average_Qty_CAD Total per Client
2 B AUD Dave 3 10 30 60
1 B EUR Dave 3 50 20 60
0 B USD Bob 4 1000 10 60
3 A USD Bob 2 100 3 6
4 A EUR Dave 1 50 2 6
5 A USD Dave 1 10 1 6
6 C USD Bob 6 100 2 3
7 C GBP Dave 5 50 1 3发布于 2019-05-18 10:23:42
将GroupBy.transform与sum一起使用,然后使用DataFrame.sort_values
df['Total per Client'] = df.groupby('Client')["Average_Qty_CAD"].transform('sum')
df = (df.sort_values(by=["Total per Client", "Client", "Average_Qty_CAD"],
ascending=[False, True, False]))
print (df)
Client Currency SalesPerson Done_Trades Average_Qty Average_Qty_CAD \
3 B AUD Dave 3 10 30
4 B EUR Dave 3 50 20
5 B USD Bob 4 1000 10
2 A USD Bob 2 100 3
1 A EUR Dave 1 50 2
0 A USD Dave 1 10 1
7 C USD Bob 6 100 2
6 C GBP Dave 5 50 1
Total per Client
3 60
4 60
5 60
2 6
1 6
0 6
7 3
6 3 发布于 2019-05-16 09:38:11
你可以这样做:
m=(df.reindex(df.groupby('Client').Qty_CAD.transform(sum).
sort_values(ascending=False).index).reset_index(drop=True))
print(m) Client SalesPerson Qty_CAD
0 B Bob 10000
1 B Dave 50
2 B Dave 10
3 A Bob 100
4 A Dave 50
5 A Dave 10
6 C Bob 100
7 C Dave 50发布于 2019-05-18 09:36:35
这做了工作,没有我想要的那么优雅:
# Get Totals per client
aux = df.groupby('Client')["Average_Qty_CAD"].sum().rename("Total per Client").reset_index()
print(aux)
# Merge Totals per client with original df and sort
# In case of tie, you want to sort by ascending client
m = df.merge(aux).sort_values(by=["Total per Client", "Client", "Average_Qty_CAD"], ascending=[False, True, False])
print(m)https://stackoverflow.com/questions/56165334
复制相似问题