I have this data:
dff = pd.DataFrame({"country":["cn","cn","cn", "cn","bn","bn","sn"], 'date':['2019-6-6','2019-6-5','2019-7-6','2019-7-9','2017-4-3','2017-3-2','2019-2-9'],'Revenue':[100,200,100,90,300,400,500], "class":["A","C", "B","B", "C","A","B"]})
dff
country date Revenue class
0 cn 2019-6-6 100 A
1 cn 2019-6-5 200 C
2 cn 2019-7-6 100 B
3 cn 2019-7-9 90 B
4 bn 2017-4-3 300 C
5 bn 2017-3-2 400 A
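6 sn 2019-2-9 500 B
I want to achieve three things:
1. Change the date from 2019-6-6 to a month format such as Jun-2019
2. Sum the revenue by month
3. Split the revenue by class
The result table should look like this: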
country date A B C
0 cn Jun-2019 100 0 200
1 cn Jul-2019 0 190 0
2 bn Apr-2017 0 0 300
3 bn Mar-2017 400 0 0
4 sn Feb-2019 0 500 0
To achieve feature 1:
dff.date = pd.to_datetime(dff.date )
dff.date = dff.date.apply(lambda x : x.strftime("%b-%Y"))
dff
country date Revenue class
0 cn Jun-2019 100 A
1 cn Jun-2019 200 C
2 cn Jul-2019 100 B
3 cn Jul-2019 90 B
4 bn Apr-2017 300 C
5 bn Mar-2017 400 A
6 sn Feb-2019 500 B
To achieve feature 2:
# Select Revenue explicitly so the string columns are not summed (concatenated) too
dff1 = dff.groupby("date")[["Revenue"]].sum()
dff1
Revenue
date
Apr-2017 300
Feb-2019 500
Jul-2019 190
Jun-2019 300
Mar-2017 400
To achieve feature 3:
dff.pivot( columns='class')['Revenue']
class A B C
0 100.0 NaN NaN
1 NaN NaN 200.0
2 NaN 100.0 NaN
3 NaN 90.0 NaN
4 NaN NaN 300.0
5 400.0 NaN NaN
6 NaN 500.0 NaN
I feel I am almost there, but I don't know how to combine these steps to get the final result. Any ideas are appreciated!
Posted on 2020-02-10 15:55:35
You can call pivot_table after converting date:
dff.pivot_table(index=['country','date'],
columns='class',
values='Revenue',
aggfunc='sum',
fill_value=0)
Output:
class               A    B    C
country date
bn      Apr-2017    0    0  300
        Mar-2017  400    0    0
cn      Jul-2019    0  190    0
        Jun-2019  100    0  200
sn      Feb-2019    0  500    0
Posted on 2020-02-10 15:53:56
IIUC
# Assumes dff.date is still datetime64, i.e. only pd.to_datetime has been applied
df = (dff.groupby([dff.country, dff.date.dt.strftime('%b-%Y'), dff['class']])['Revenue']
         .sum()
         .unstack(fill_value=0))
df  # df = df.reset_index()
Out[13]:
class               A    B    C
country date
bn      Apr-2017    0    0  300
        Mar-2017  400    0    0
cn      Jul-2019    0  190    0
        Jun-2019  100    0  200
sn      Feb-2019    0  500    0
Or:
pd.crosstab(index=[dff.country,dff.date.dt.strftime('%b-%Y')],columns=dff['class'], values=dff['Revenue'], aggfunc='sum').fillna(0)
Out[20]:
class                 A      B      C
country date
bn      Apr-2017    0.0    0.0  300.0
        Mar-2017  400.0    0.0    0.0
cn      Jul-2019    0.0  190.0    0.0
        Jun-2019  100.0    0.0  200.0
sn      Feb-2019    0.0  500.0    0.0
https://stackoverflow.com/questions/60153872
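The answers above return a table indexed by country and date and sorted alphabetically, while the question asks for a flat table in the original row order. The following is a minimal sketch of one way to get there, not part of either answer. It combines the date conversion with pivot_table, then flattens the index. The final merge step and the names `wide`, `flat`, `order` and `result` are illustrative assumptions, and `%b` is assumed to produce English month abbreviations (the default locale).

```python
import pandas as pd

dff = pd.DataFrame({
    "country": ["cn", "cn", "cn", "cn", "bn", "bn", "sn"],
    "date": ["2019-6-6", "2019-6-5", "2019-7-6", "2019-7-9",
             "2017-4-3", "2017-3-2", "2019-2-9"],
    "Revenue": [100, 200, 100, 90, 300, 400, 500],
    "class": ["A", "C", "B", "B", "C", "A", "B"],
})

# Feature 1: turn each date into a month label such as "Jun-2019".
dff["date"] = pd.to_datetime(dff["date"]).dt.strftime("%b-%Y")

# Features 2 and 3: sum Revenue per (country, month) and spread classes into columns.
wide = dff.pivot_table(index=["country", "date"], columns="class",
                       values="Revenue", aggfunc="sum", fill_value=0)

# Flatten the MultiIndex into ordinary columns and drop the "class" axis label.
flat = wide.reset_index()
flat.columns.name = None

# pivot_table sorts its index; a left merge on the (country, date) pairs
# in first-appearance order restores the row order used in the question.
order = dff[["country", "date"]].drop_duplicates()
result = order.merge(flat, on=["country", "date"], how="left")
print(result)
```

If the alphabetical order produced by pivot_table is acceptable, the merge step can be skipped and `flat` used directly.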