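Following on from this question, I want to get each item's share of the total amount for July, by id. I am using the same dataset as in that question: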
id date num name type price
0 1 7/6/2020 10 pen abcd $1
1 1 7/6/2020 2 abc efg $3
2 1 7/6/2020 3 bcd efg $5
3 2 7/6/2020 3 pen abcd $1
4 2 7/6/2020 1 pencil abcd $3
5 2 7/6/2020 2 disk abcd $1
6 2 7/6/2020 2 paper abcd $1
7 3 7/6/2020 2 ff pag $100
8 3 7/6/2020 10 water kml $5
9 4 7/15/2020 5 gg kml $5
10 4 7/15/2020 10 cofffee oo $5
11 5 7/15/2020 5 pp oo $4
12 6 7/15/2020 2 abc efg $3
13 6 7/15/2020 3 bcd efg $5
14 6 7/15/2020 4 aa efg $5
15 6 7/15/2020 5 bb efg $6
16 7 7/15/2020 1 bag abcd $50
17 7 7/15/2020 1 box abcd $20
18 8 7/15/2020 1 pencil abcd $3
19 8 7/15/2020 2 disk abcd $1
20 8 7/15/2020 2 paper abcd $1
21 8 7/15/2020 2 ff hijk $100
22 9 8/15/2020 10 water kml $5
23 9 8/15/2020 5 gg kml $5
24 9 8/15/2020 10 cofffee oo $5
25 9 8/15/2020 5 pp oo $4
26 9 8/15/2020 2 abc efg $3
27 10 8/15/2020 3 bcd efg $5
28 10 8/15/2020 4 aa efg $5
29 10 8/15/2020 5 bb efg $6
30 11 8/15/2020 1 bag abcd $50
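31 11 8/15/2020 1 box abcd $20
I would like to show a daily bar chart of the total amount by type, in pyecharts or another library. It should look something like this:

[image]

The code below is not correct: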
import pandas as pd
import xlrd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_excel ('./orders.xlsx', sheet_name='Sheet1')
df.groupby(by=['type']).sum()
df['price'] = df['price'].replace('$','', regex=True).astype(int)
df['new'] = df['price'].mul(df['num'])
df1 = df.groupby(by=['name'], as_index=False)['new'].sum()
# df1
# df1['new'] = df1.apply(lambda x: x.sum(), axis=1)
# df1.loc['new'] = df1.apply(lambda x: x.sum()).dropna()
Thanks a lot for any suggestions.
Posted on 2020-11-27 03:18:03
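First, I would recommend using the datetime type to handle dates and times:
df['date'] = pd.to_datetime(df['date'])
Now, to answer your question: if you only want the July data, you can extract it with:
July_df = df[df['date'].dt.to_period('M')=='2020-07'].copy()
You can then go on to plot July_df.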
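As a minimal sketch of how the July subset could be turned into each id's share of the July total (the goal stated in the question): the sample rows are copied from the question's table, while the `total` column and the names `july_by_id` and `july_share` are introduced here for illustration only.

```python
import pandas as pd

# A few rows copied from the question's table (ids 1, 3 and 9)
df = pd.DataFrame({
    "id":    [1, 1, 1, 3, 3, 9],
    "date":  ["7/6/2020", "7/6/2020", "7/6/2020",
              "7/6/2020", "7/6/2020", "8/15/2020"],
    "num":   [10, 2, 3, 2, 10, 10],
    "price": ["$1", "$3", "$5", "$100", "$5", "$5"],
})

df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")
# regex=False so that "$" is a literal dollar sign, not an end-of-string anchor
df["total"] = df["price"].str.replace("$", "", regex=False).astype(float) * df["num"]

# Keep only the rows from July 2020
july_df = df[(df["date"].dt.year == 2020) & (df["date"].dt.month == 7)].copy()

# Amount per id in July, then each id's fraction of the July total
july_by_id = july_df.groupby("id")["total"].sum()
july_share = july_by_id / july_by_id.sum()
print(july_share)
```

Multiplying `july_share` by 100 gives percentages, and `july_share.plot.pie(autopct='%.2f%%')` would draw the same proportions as a pie chart.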
If you want a plot for each month, you can use groupby:
df['total'] = df['price'].str.replace('$', '', regex=False).astype(float) * df['num']
(df.groupby([pd.Grouper(key='date',freq='M'),'name'])['total'].sum()
   .reset_index(level='date')
   .groupby('date')
   .plot.pie(subplots=True, autopct='%.2f%%')
)
You will get two plots like these:


If you iterate over the groupby, you can also add more formatting:
# notice the difference in first groupby
groups = (df.groupby([df.date.dt.strftime('%b-%Y'),'name'])['total'].sum()
            .reset_index(level='date')
            .groupby('date')
         )
fig, axes = plt.subplots(1,2, figsize=(10,5))
for ax, (month, data) in zip(axes, groups):
    data['total'].plot.pie(autopct='%.2f%%', ax=ax)
    ax.set_title(f'data in {month}')
Output:

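The question also asks for a daily bar chart of the total amount by type, which the pie charts do not cover. One possible sketch, assuming a stacked bar per day is what is wanted, uses `pivot_table` and pandas' matplotlib plotting. The sample rows are copied from the question's table, and the name `daily_by_type` is introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# A few rows copied from the question's table
df = pd.DataFrame({
    "date":  ["7/6/2020", "7/6/2020", "7/6/2020", "7/15/2020", "7/15/2020"],
    "type":  ["abcd", "efg", "efg", "kml", "oo"],
    "num":   [10, 2, 3, 5, 5],
    "price": ["$1", "$3", "$5", "$5", "$4"],
})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")
# regex=False so that "$" is a literal dollar sign, not an end-of-string anchor
df["total"] = df["price"].str.replace("$", "", regex=False).astype(float) * df["num"]

# One row per day, one column per type; missing combinations become 0
daily_by_type = df.pivot_table(index="date", columns="type", values="total",
                               aggfunc="sum", fill_value=0)
# Use plain date strings so the x-axis labels stay short
daily_by_type.index = daily_by_type.index.strftime("%Y-%m-%d")

ax = daily_by_type.plot.bar(stacked=True, figsize=(8, 4))
ax.set_xlabel("date")
ax.set_ylabel("total amount ($)")
ax.set_title("Daily total by type")
plt.tight_layout()
```

Passing `stacked=False` would instead draw the types side by side for each day.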
https://stackoverflow.com/questions/65025576