My data records look like this:
category dt userid amt
1 4/14/2019 1 140
1 5/1/2019 1 500
2 5/5/2019 1 300
3 5/19/2019 1 230
2 6/17/2019 1 200
4 6/18/2019 1 400
1 7/30/2019 1 400
1 8/17/2019 1 300
2 12/2/2019 1 200
2 12/23/2019 1 500
1 1/10/2019 2 470
1 2/25/2019 2 450
2 10/4/2019 2 350

Q1: How can I compute the average amount for each month of the year?
user month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 avg_all_month
1 0 0 0 140 343.33 300 400 300 0 0 0 350 305.55
2 470 450 0 0 0 0 0 0 0 350 0 0 423.33

Q2: How can I count the transactions in each category?
user pro_cat1 pro_cat2 pro_cat3 pro_cat4 total_product
1 4 3 1 1 7
2 2 1 0 0 3

Posted on 2020-05-16 19:21:37
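To try the answer below on the question's sample, the data first has to be in a DataFrame named df. A minimal sketch, assuming the table is pasted as whitespace-separated text (the variable name raw is only illustrative and does not appear in the original post):

```python
import io

import pandas as pd

# Sample rows copied verbatim from the question.
raw = """category dt userid amt
1 4/14/2019 1 140
1 5/1/2019 1 500
2 5/5/2019 1 300
3 5/19/2019 1 230
2 6/17/2019 1 200
4 6/18/2019 1 400
1 7/30/2019 1 400
1 8/17/2019 1 300
2 12/2/2019 1 200
2 12/23/2019 1 500
1 1/10/2019 2 470
1 2/25/2019 2 450
2 10/4/2019 2 350"""

# Columns are separated by runs of whitespace; dt is still text at this point.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(df.head())
```

The dt column is left as text here, so the conversion with pd.to_datetime in the answer still applies.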
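If all the data falls within the same year, you can use DataFrame.pivot_table to get the mean per user and month, DataFrame.reindex to add the months that have no data, DataFrame.add_prefix to name the columns, and a per-user mean over all rows for the last column: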
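import pandas as pd

# df is assumed to hold the sample data from the question.
# The dates are month/day/year, which is what to_datetime assumes by default.
df['dt'] = pd.to_datetime(df['dt'])

df2 = (df.pivot_table(index='userid',
                      columns=df['dt'].dt.month,   # one column per month
                      values='amt',
                      aggfunc='mean',
                      fill_value=0)
         .reindex(range(1, 13), axis=1, fill_value=0)   # add missing months
         .add_prefix('month')
         # mean over all of a user's rows, aligned on the userid index
         .assign(avg_all_month=lambda x: df.groupby('userid')['amt'].mean())
         .reset_index()
         .rename_axis(None, axis=1))
print(df2)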
   userid  month1  month2  month3  month4      month5  month6  month7  month8  \
0       1       0       0       0     140  343.333333     300     400     300
1       2     470     450       0       0    0.000000       0       0       0

   month9  month10  month11  month12  avg_all_month
0       0        0        0      350     317.000000
1       0      350        0        0     423.333333

For the second question, use crosstab and add a row-wise sum:
# Count rows per user and category, then add the total per user.
df3 = (pd.crosstab(df['userid'],
                   df['category'])
         .add_prefix('pro_')
         .assign(total_product=lambda x: x.sum(axis=1))
         .reset_index()
         .rename_axis(None, axis=1))
print(df3)
   userid  pro_1  pro_2  pro_3  pro_4  total_product
0       1      4      4      1      1             10
1       2      2      1      0      0              3

https://stackoverflow.com/questions/61835927
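The expected output in the question shows 305.55 for user 1, which is the mean of the six monthly averages that have data. The avg_all_month value of 317.0 printed above is instead the mean over all of that user's transactions. If the former is what is wanted, or if the data may span more than one year, one possible variant is sketched below. This is an assumption about the asker's intent rather than part of the original answer, and the names monthly and wide are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "category": [1, 1, 2, 3, 2, 4, 1, 1, 2, 2, 1, 1, 2],
    "dt": ["4/14/2019", "5/1/2019", "5/5/2019", "5/19/2019", "6/17/2019",
           "6/18/2019", "7/30/2019", "8/17/2019", "12/2/2019", "12/23/2019",
           "1/10/2019", "2/25/2019", "10/4/2019"],
    "userid": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2],
    "amt": [140, 500, 300, 230, 200, 400, 400, 300, 200, 500, 470, 450, 350],
})
df["dt"] = pd.to_datetime(df["dt"], format="%m/%d/%Y")

# Mean amount per user, year and month; months without data are absent here.
monthly = (df.groupby(["userid",
                       df["dt"].dt.year.rename("year"),
                       df["dt"].dt.month.rename("month")])["amt"]
             .mean())

# Wide table: one row per (userid, year), one column per calendar month.
# Months without data are NaN at this stage.
wide = (monthly.unstack("month")
               .reindex(columns=range(1, 13))
               .add_prefix("month"))

# Row-wise mean skips NaN, so only months that have data are averaged.
wide["avg_all_month"] = wide.mean(axis=1)

# Only now replace the empty months with 0 for display.
wide = wide.fillna(0).reset_index().rename_axis(None, axis=1)
print(wide)
```

The average is taken before the empty months are filled with 0, so those months do not pull it down. Grouping by year as well as month keeps, say, January 2019 and January 2020 in separate rows instead of merging them into one month1 column.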