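I want to calculate YTD for each month using pandas. I have two measurements, named sales and sales rate. For the sales measurement, YTD is simply a cumulative sum. The code is as follows:
report_table['ytd_value'] = report_table.groupby(['financial_year', 'measurement', 'place', 'market', 'product'], sort=False)['value'].cumsum()
For the sales rate measurement, however, YTD is calculated differently.
The YTD calculation for sales rate is as follows:
YTD value for the first month of the financial year (April) = value for the first month of the financial year (April)
From the second month of the financial year onward, the YTD value is calculated with a formula:
Month YTD value = (April YTD value (sales) * April YTD value (sales rate) + April value (sales) * April value (sales rate)) / (April value (sales) + April value (sales rate))
The same applies to the other months. A sample DataFrame is given below.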

import pandas as pd
data = {'Month': ['April', 'May', 'April', 'June', 'April', 'May'],
'Year': [2022, 2022, 2022, 2022, 2022, 2022],
'Financial_Year': [2023, 2023, 2023, 2023, 2023, 2023],
'Measurement': ['sales', 'sales', 'sales', 'sales', 'sales rate', 'sales rate'],
'Place': ['Delhi', 'Delhi', 'Delhi', 'Delhi', 'Delhi', 'Delhi'],
'Market': ['Domestic', 'Domestic', 'Export', 'Domestic', 'Domestic', 'Domestic'],
'Product': ['Biscuit', 'Biscuit', 'Chocolate', 'Biscuit', 'Biscuit', 'Biscuit'],
'Value': ['10', '10', '20', '25', '10', '20']}
# Create DataFrame
df = pd.DataFrame(data)
df['Value'] = df['Value'].astype(float)
df['ytd_value'] = df.groupby(['Financial_Year', 'Measurement', 'Place', 'Market', 'Product'], sort=False)['Value'].cumsum()

This calculates ytd_value for both the sales and the sales rate measurements. But I want ytd_value for sales rate to be calculated with the formula described above.
I tried the code below, but it raises an error:
rslt_df = df[(df['Measurement'] == 'sales')]
df.loc[df['Measurement'] == "sales rate", 'ytd_value'] = (df.groupby(['Financial_Year', 'Measurement', 'Place', 'Market', 'Product'], sort=False)['ytd_value']*rslt_df.groupby(['Financial_Year', 'Measurement', 'Place', 'Market', 'Product'], sort=False)['ytd_value'] + df.groupby(['Financial_Year', 'Measurement', 'Place', 'Market', 'Product'], sort=False)['Value'] * rslt_df.groupby(['Financial_Year', 'Measurement', 'Place', 'Market', 'Product'], sort=False)['Value']) / (rslt_df.groupby(['Financial_Year', 'Measurement', 'Place', 'Market', 'Product'], sort=False)['ytd_value'] + rslt_df.groupby(['Financial_Year', 'Measurement', 'Place', 'Market', 'Product'], sort=False)['Value'])

Expected output:
   Month  Year  Financial_Year  ...    Product  Value  ytd_value
0  April  2022            2023  ...    Biscuit   10.0       10.0
1    May  2022            2023  ...    Biscuit   10.0       20.0
2  April  2022            2023  ...  Chocolate   20.0       20.0
3   June  2022            2023  ...    Biscuit   25.0       45.0
4  April  2022            2023  ...    Biscuit   10.0       10.0
5    May  2022            2023  ...    Biscuit   20.0       10.0

Can anyone help me solve this problem?
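Answered 2022-09-10 15:25:59
I suggest changing the data format:
   Month  Year  Financial_Year  Place    Market  Product  Sales  Sales Rate
0  April  2022            2023  Delhi  Domestic  Biscuit   10.0        10.0
1    May  2022            2023  Delhi  Domestic  Biscuit   10.0        20.0
2   June  2022            2023  Delhi  Domestic  Biscuit   25.0         0.0

You can get there by aggregating the sales values for each month; the key point is that each month ends up with a single sales value and a single sales rate value.
Once you have that, you can set the YTD value for April and then iterate over the following months to compute their values.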
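The reshaping step described above can be sketched with `pivot_table`. This is only a minimal sketch, not necessarily how the table above was produced: the names `long_df`, `wide_df`, `fy_month_order` and the helper column `fy_month` are assumptions introduced for illustration, and a missing sales rate is filled with 0 to mirror the table above.

```python
import pandas as pd

# Long format, as in the question: one row per month and measurement.
long_df = pd.DataFrame({
    'Month': ['April', 'May', 'June', 'April', 'May'],
    'Financial_Year': [2023] * 5,
    'Measurement': ['sales', 'sales', 'sales', 'sales rate', 'sales rate'],
    'Place': ['Delhi'] * 5,
    'Market': ['Domestic'] * 5,
    'Product': ['Biscuit'] * 5,
    'Value': [10.0, 10.0, 25.0, 10.0, 20.0],
})

keys = ['Financial_Year', 'Place', 'Market', 'Product', 'Month']

# One column per measurement; months without a sales rate get 0.
wide_df = long_df.pivot_table(
    index=keys, columns='Measurement', values='Value',
    aggfunc='sum', fill_value=0,
).reset_index()
wide_df.columns.name = None

# pivot_table sorts months alphabetically, so restore financial-year order.
# (Hypothetical mapping; extend it to all twelve months as needed.)
fy_month_order = {'April': 1, 'May': 2, 'June': 3}
wide_df['fy_month'] = wide_df['Month'].map(fy_month_order)
wide_df = wide_df.sort_values(keys[:-1] + ['fy_month']).reset_index(drop=True)

print(wide_df)
```

After this step each row holds one month's `sales` and `sales rate` side by side, which is what the iteration over months relies on.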
I think there is an error in the YTD formula you posted, but going by that formula, here is some sample code:
import pandas as pd
data = {'Month': ['April', 'May', 'June'],
'Year': [2022, 2022, 2022],
'Financial_Year': [2023, 2023, 2023],
'Place': ['Delhi', 'Delhi', 'Delhi'],
'Market': ['Domestic', 'Domestic', 'Domestic'],
'Product': ['Biscuit', 'Biscuit', 'Biscuit'],
'Sales': [10, 10, 25],
'Sales Rate': [10, 20, 0]}
# Create DataFrame
df = pd.DataFrame(data)
df['Sales'] = df['Sales'].astype(float)
df['Sales Rate'] = df['Sales Rate'].astype(float)
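df['YTD'] = 0.0
# The first month of the financial year (April) seeds the YTD column.
df.at[0, 'YTD'] = df.iloc[0]['Sales']
for rowidx in range(1, len(df)):
    prevrow = df.iloc[rowidx - 1]
    tmp = prevrow['Sales'] * prevrow['Sales Rate']
    # Placeholder arithmetic based on the formula as posted; replace it with the correct YTD formula.
    df.at[rowidx, 'YTD'] = tmp + tmp / tmp
print(df)

For example, this produces the following result: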
   Month  Year  Financial_Year  Place    Market  Product  Sales  Sales Rate    YTD
0  April  2022            2023  Delhi  Domestic  Biscuit   10.0        10.0   10.0
1    May  2022            2023  Delhi  Domestic  Biscuit   10.0        20.0  101.0
2   June  2022            2023  Delhi  Domestic  Biscuit   25.0         0.0  201.0

You should be able to use this example to implement the correct function for calculating the YTD values.
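If the intended rule is a sales-weighted running average of the rate, the loop can be replaced by two cumulative sums. That reading is an assumption, since the formula in the question is ambiguous: it takes the YTD rate for a month to be (previous YTD sales * previous YTD rate + this month's sales * this month's rate) / (previous YTD sales + this month's sales). The column names `ytd_sales`, `weighted`, `ytd_weighted` and `ytd_sales_rate` are made up for this sketch, and the rows are assumed to be in month order within each group.

```python
import pandas as pd

df = pd.DataFrame({
    'Month': ['April', 'May', 'June'],
    'Financial_Year': [2023, 2023, 2023],
    'Place': ['Delhi'] * 3,
    'Market': ['Domestic'] * 3,
    'Product': ['Biscuit'] * 3,
    'Sales': [10.0, 10.0, 25.0],
    'Sales Rate': [10.0, 20.0, 0.0],
})

keys = ['Financial_Year', 'Place', 'Market', 'Product']

# YTD sales: plain cumulative sum within each group.
df['ytd_sales'] = df.groupby(keys, sort=False)['Sales'].cumsum()

# Running total of sales * rate within each group.
df['weighted'] = df['Sales'] * df['Sales Rate']
df['ytd_weighted'] = df.groupby(keys, sort=False)['weighted'].cumsum()

# Assumed rule: YTD rate = cumulative (sales * rate) / cumulative sales.
# In the first month this reduces to that month's own rate.
df['ytd_sales_rate'] = df['ytd_weighted'] / df['ytd_sales']

print(df[['Month', 'Sales', 'Sales Rate', 'ytd_sales', 'ytd_sales_rate']])
```

With the sample data this gives 10.0 for April and 15.0 for May, i.e. (10 * 10 + 10 * 20) / (10 + 10). A group whose cumulative sales is 0 would divide by zero, so such rows need separate handling.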
https://stackoverflow.com/questions/73669233