I have data like this:
Thing quarter num_col1 num_col2
aaa 2010Q1 1.3 99.76
bbb 2010Q1 11.3 109.76
ccc 2010Q1 91.3 119.76
.....
.....
aaa 2019Q4 21.3 119.76
bbb 2019Q4 41.3 299.76
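ccc 2019Q4 201.3 199.76
I need to group by the Thing column and compute the moving average of the num_col1 and num_col2 columns across all quarters.
So far, this is what I have been trying: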
import pandas as pd

## define moving-average function
N = 2
def pandas_rolling(x):
    return x.rolling(window=N).mean()
## now group-by and calculate moving averages
things_groupby = df.groupby(by=['Thing'])
## below lines are giving incorrect values
df.loc[:,'num_col1_SMA'] = (things_groupby['num_col1'].apply(pandas_rolling)).values
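df.loc[:,'num_col2_SMA'] = (things_groupby['num_col2'].apply(pandas_rolling)).values
However, when I handle a single unique value of the Thing column manually, as shown below, it gives the expected result:
pandas_rolling(df.loc[df.loc[:,'Thing']=='aaa'].loc[:,'num_col1']).values
What am I doing wrong when computing the moving average for each group and then filling the results back into the data? How can I do this correctly?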
Posted on 2021-12-08 10:34:07
You can remove .values, so that the result is assigned by index rather than by position:
df['num_col1_SMA'] = things_groupby['num_col1'].apply(pandas_rolling)
df['num_col2_SMA'] = things_groupby['num_col2'].apply(pandas_rolling)
Or:
df[['num_col1_SMA', 'num_col2_SMA']] = (things_groupby[['num_col1','num_col2']]
                                        .apply(pandas_rolling))
If possible, avoid groupby.apply altogether; it is then necessary to remove the first level of the resulting MultiIndex:
df[['num_col1_SMA', 'num_col2_SMA']] = (things_groupby[['num_col1','num_col2']]
                                        .rolling(window=N)
                                        .mean()
                                        .droplevel(0))
https://stackoverflow.com/questions/70273377
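As a further sketch of the same idea (not part of the original answer), groupby(...).transform returns a result that carries the same index as the input frame, so assigning it back aligns by row label even when the rows of each Thing are interleaved. The small frame below is made-up illustrative data, and the example assumes the rows are already in chronological order within each Thing.

```python
import pandas as pd

N = 2  # window size, as in the question

# Made-up illustrative data: rows of each Thing are interleaved by quarter.
df = pd.DataFrame({
    "Thing": ["aaa", "bbb", "ccc"] * 3,
    "quarter": ["2010Q1"] * 3 + ["2010Q2"] * 3 + ["2010Q3"] * 3,
    "num_col1": [1.0, 10.0, 100.0, 3.0, 30.0, 300.0, 5.0, 50.0, 500.0],
    "num_col2": [2.0, 20.0, 200.0, 4.0, 40.0, 400.0, 6.0, 60.0, 600.0],
})

# transform applies the rolling mean within each group and returns a frame
# indexed like df, so no .values and no droplevel are needed.
sma = df.groupby("Thing")[["num_col1", "num_col2"]].transform(
    lambda s: s.rolling(window=N).mean()
)

# Assign column by column; pandas aligns each Series on the row index.
df["num_col1_SMA"] = sma["num_col1"]
df["num_col2_SMA"] = sma["num_col2"]

print(df)
```

The first row of each Thing is NaN because a window of 2 needs two observations. If the data is not already ordered by quarter, sorting with df.sort_values(["Thing", "quarter"]) before grouping would be a reasonable first step.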