首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >熊猫按行将绝对值转换为多组百分比

熊猫按行将绝对值转换为多组百分比
EN

Stack Overflow用户
提问于 2022-03-10 21:57:05
回答 1查看 189关注 0票数 0

我有一个长格式的df,其中包含一个包含3个不同级别日期、国家和组的总量列(绝对值)。

代码语言:javascript
复制
import pandas as pd
/*
* 提示:该行代码过长,系统自动注释不进行高亮。一键复制会移除系统注释 
* df  = pd.DataFrame.from_dict([{ "date": "2022-02", "country": "Serbia", "group": 3, "amount": 33948 }, { "date": "2021-05", "country": "Thailand", "group": 3, "amount": 15857 }, { "date": "2021-05", "country": "Russia", "group": 2, "amount": 42855 }, { "date": "2021-11", "country": "Ukraine", "group": 3, "amount": 57306 }, { "date": "2021-06", "country": "Poland", "group": 2, "amount": 52898 }, { "date": "2022-02", "country": "Indonesia", "group": 3, "amount": 32330 }, { "date": "2021-11", "country": "Indonesia", "group": 1, "amount": 33791 }, { "date": "2022-02", "country": "China", "group": 3, "amount": 45050 }, { "date": "2021-12", "country": "Indonesia", "group": 1, "amount": 13865 }, { "date": "2022-03", "country": "Sweden", "group": 1, "amount": 45039 }, { "date": "2021-05", "country": "Colombia", "group": 3, "amount": 9363 }, { "date": "2022-01", "country": "Bangladesh", "group": 1, "amount": 47121 }, { "date": "2022-02", "country": "Indonesia", "group": 2, "amount": 18855 }, { "date": "2021-05", "country": "China", "group": 1, "amount": 49383 }, { "date": "2021-06", "country": "Turkmenistan", "group": 3, "amount": 61386 }, { "date": "2021-09", "country": "Kenya", "group": 3, "amount": 40434 }, { "date": "2022-03", "country": "Nicaragua", "group": 3, "amount": 3801 }, { "date": "2022-02", "country": "China", "group": 1, "amount": 39416 }, { "date": "2022-03", "country": "Brazil", "group": 1, "amount": 13657 }, { "date": "2021-05", "country": "Colombia", "group": 2, "amount": 23473 }, { "date": "2022-02", "country": "China", "group": 3, "amount": 19742 }, { "date": "2021-08", "country": "Russia", "group": 2, "amount": 45098 }, { "date": "2022-01", "country": "China", "group": 3, "amount": 15158 }, { "date": "2021-08", "country": "China", "group": 3, "amount": 18376 }, { "date": "2022-01", "country": "Slovenia", "group": 2, "amount": 71213 }, { "date": "2022-02", "country": "Czech Republic", "group": 2, "amount": 32744 }, { "date": "2021-06", "country": "Netherlands", "group": 1, "amount": 42706 }, { "date": "2021-07", "country": "China", "group": 2, "amount": 40277 }, { "date": "2021-08", "country": "United States", "group": 2, "amount": 3070 }, { "date": "2021-07", "country": "Germany", "group": 3, "amount": 17039 }, { "date": "2021-12", "country": "China", "group": 2, "amount": 8714 }, { "date": "2022-01", "country": "Malta", "group": 2, "amount": 44230 }, { "date": "2022-01", "country": "Russia", "group": 3, "amount": 33626 }, { "date": "2021-09", "country": "Greece", "group": 2, "amount": 72860 }, { "date": "2021-08", "country": "China", "group": 1, "amount": 59254 }, { "date": "2022-01", "country": "Japan", "group": 3, "amount": 18136 }, { "date": "2021-08", "country": "Venezuela", "group": 2, "amount": 14065 }, { "date": "2022-01", "country": "China", "group": 2, "amount": 36930 }, { "date": "2022-01", "country": "Honduras", "group": 2, "amount": 768 }, { "date": "2021-08", "country": "Vietnam", "group": 2, "amount": 33652 }, { "date": "2021-07", "country": "Ukraine", "group": 2, "amount": 54050 }, { "date": "2021-09", "country": "Indonesia", "group": 2, "amount": 50304 }, { "date": "2021-10", "country": "Peru", "group": 1, "amount": 27157 }, { "date": "2021-08", "country": "Brazil", "group": 3, "amount": 15869 }, { "date": "2021-11", "country": "Sweden", "group": 1, "amount": 32451 }, { "date": "2021-12", "country": "Mozambique", "group": 2, "amount": 29659 }, { "date": "2022-01", "country": "Argentina", "group": 2, "amount": 25282 }, { "date": "2021-06", "country": "Mongolia", "group": 2, "amount": 63027 }, { "date": "2021-07", "country": "Sudan", "group": 2, "amount": 5006 }, { "date": "2021-08", "country": "United States", "group": 2, "amount": 73414 }, { "date": "2021-05", "country": "China", "group": 3, "amount": 34759 }, { "date": "2021-12", "country": "Brazil", "group": 1, "amount": 636 }, { "date": "2021-06", "country": "Philippines", "group": 2, "amount": 59227 }, { "date": "2021-10", "country": "Russia", "group": 1, "amount": 28537 }, { "date": "2021-08", "country": "China", "group": 3, "amount": 23460 }, { "date": "2022-02", "country": "Philippines", "group": 2, "amount": 62968 }, { "date": "2021-10", "country": "Ukraine", "group": 3, "amount": 63908 }, { "date": "2021-10", "country": "Ukraine", "group": 3, "amount": 38263 }, { "date": "2021-06", "country": "Botswana", "group": 1, "amount": 15918 }, { "date": "2022-02", "country": "Russia", "group": 1, "amount": 31156 }, { "date": "2021-07", "country": "France", "group": 3, "amount": 64077 }, { "date": "2021-07", "country": "China", "group": 1, "amount": 18932 }, { "date": "2022-02", "country": "Russia", "group": 1, "amount": 45279 }, { "date": "2021-07", "country": "Russia", "group": 1, "amount": 7849 }, { "date": "2021-09", "country": "China", "group": 1, "amount": 52640 }, { "date": "2021-07", "country": "Peru", "group": 2, "amount": 19369 }, { "date": "2021-07", "country": "Greece", "group": 1, "amount": 20489 }, { "date": "2021-11", "country": "China", "group": 3, "amount": 30177 }, { "date": "2021-07", "country": "Portugal", "group": 1, "amount": 69521 }, { "date": "2021-06", "country": "Thailand", "group": 3, "amount": 17341 }, { "date": "2021-12", "country": "Peru", "group": 3, "amount": 27012 }, { "date": "2021-12", "country": "Afghanistan", "group": 1, "amount": 34146 }, { "date": "2021-11", "country": "Indonesia", "group": 1, "amount": 57619 }, { "date": "2021-05", "country": "Portugal", "group": 1, "amount": 37319 }, { "date": "2022-01", "country": "Denmark", "group": 1, "amount": 18370 }, { "date": "2022-01", "country": "United States", "group": 3, "amount": 4690 }, { "date": "2021-12", "country": "China", "group": 1, "amount": 35333 }, { "date": "2021-10", "country": "Indonesia", "group": 3, "amount": 74285 }, { "date": "2021-09", "country": "Mexico", "group": 1, "amount": 11260 }, { "date": "2021-11", "country": "Ukraine", "group": 3, "amount": 44389 }, { "date": "2021-11", "country": "France", "group": 3, "amount": 29432 }, { "date": "2021-08", "country": "Ecuador", "group": 1, "amount": 24529 }, { "date": "2021-08", "country": "Democratic Republic of the Congo", "group": 1, "amount": 5211 }, { "date": "2021-12", "country": "Georgia", "group": 3, "amount": 54164 }, { "date": "2021-05", "country": "France", "group": 2, "amount": 9046 }, { "date": "2021-05", "country": "Sweden", "group": 1, "amount": 10326 }, { "date": "2022-02", "country": "Madagascar", "group": 1, "amount": 70109 }, { "date": "2022-01", "country": "China", "group": 1, "amount": 25702 }, { "date": "2021-09", "country": "Poland", "group": 2, "amount": 46625 }, { "date": "2022-01", "country": "Czech Republic", "group": 1, "amount": 23806 }, { "date": "2021-06", "country": "Poland", "group": 2, "amount": 63310 }, { "date": "2021-11", "country": "Poland", "group": 3, "amount": 56290 }, { "date": "2021-12", "country": "Russia", "group": 3, "amount": 45846 }, { "date": "2021-09", "country": "Sweden", "group": 3, "amount": 26358 }, { "date": "2021-09", "country": "Colombia", "group": 2, "amount": 14682 }, { "date": "2021-11", "country": "China", "group": 1, "amount": 65021 }, { "date": "2022-02", "country": "Peru", "group": 1, "amount": 29406 }, { "date": "2022-01", "country": "China", "group": 1, "amount": 57333 }, { "date": "2021-05", "country": "Philippines", "group": 2, "amount": 28340 }, { "date": "2021-10", "country": "Japan", "group": 2, "amount": 37300 }])
*/

示例df

代码语言:javascript
复制
    date    country  group  amount
0   2022-02 Serbia   3      33948
1   2021-05 Thailand 3      15857
2   2021-05 Russia   2      42855
3   2021-11 Ukraine  3      57306
4   2021-06 Poland   2      52898
...

日期可以是任何yyyy-mm,国家可以是任何国家,组可以是123

我想做的是按日期和国家分组,然后为每个组计算分组日期和国家的相对百分比。

例如,以上面的原始df为例,得到类似于(例如一个日期和一个国家)的信息:

代码语言:javascript
复制
date    country  group  amount
2022-02 Serbia   1      33948
                 2      34567
                 3      96787

然后将金额换算为百分比:

代码语言:javascript
复制
date    country  group  amount_percentage
2022-02 Serbia   1      20.5
                 2      20.9
                 3      58.6

在恢复到所有日期和国家的原始格式之前:

代码语言:javascript
复制
    date    country  group  amount_percentage
0   2022-02 Serbia   1      20.5
1   2022-02 Serbia   2      20.9
2   2022-02 Serbia   3      58.6
...

我目前处理这一问题的方法是:

代码语言:javascript
复制
df.groupby(['date', 'country', 'group'])['amount'].sum().unstack()

这给了我分组的日期和国家栏,而组变成了相对数量的列。

代码语言:javascript
复制
date    country  1     2     3
2022-02 Serbia   33948 34567 96787  
        USA      23457 67589 23456  
...

但是,我不知道如何将这些转换为行总数的百分比,然后再将dataframe转换回最终格式。最后这部分你会怎么说?

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2022-03-10 22:02:31

如果我正确理解,您可以执行groupby,然后在amount上执行transform('sum'),并将amount除以如下:

代码语言:javascript
复制
df['amount_percentage'] = df['amount'] / df.groupby(['date', 'country'], sort=False)['amount'].transform('sum') * 100

输出:

代码语言:javascript
复制
>>> df
       date      country  group  amount  amount_percentage
0   2022-02       Serbia      3   33948         100.000000
1   2021-05     Thailand      3   15857         100.000000
2   2021-05       Russia      2   42855         100.000000
3   2021-11      Ukraine      3   57306          56.350853
4   2021-06       Poland      2   52898          45.520102
..      ...          ...    ...     ...                ...
95  2021-11        China      1   65021          68.300805
96  2022-02         Peru      1   29406         100.000000
97  2022-01        China      1   57333          42.430230
98  2021-05  Philippines      2   28340         100.000000
99  2021-10        Japan      2   37300         100.000000
票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/71431266

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档