# Price
0 1.00
1 12.23
2 3.24
3 12.67
6 149.98
7 19.98
8 1883.23
9 1.99
10 4.89
11 9.99
12 12.99
13 18.23
14 17.99
15 18.98
16 18.11
17 19.10
18 20.30
19 1901.30
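20 20.27k
Suppose I have the data above. I want to add two columns, mean_a and mean_b. mean_a should compute the average of the next k levels, and mean_b the average of the previous k levels. For example, at #10 with k=3, mean_a = (4.89 + 9.99 + 12.99)/3 = 9.29 and mean_b = (4.89 + 1.99 + 1883.23)/3 = 630.0366667. How can I implement this in Python?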
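For reference, the worked example in the question can be checked with plain Python. This is only a sanity check of the arithmetic, not a solution; the variable names are illustrative. Note that both windows include row #10 itself.

```python
# Values copied from the rows around #10 in the table above.
next_three = [4.89, 9.99, 12.99]        # rows 10, 11, 12 (forward window)
previous_three = [4.89, 1.99, 1883.23]  # rows 10, 9, 8 (backward window)

mean_a = sum(next_three) / len(next_three)
mean_b = sum(previous_three) / len(previous_three)

print(round(mean_a, 2))  # 9.29
print(round(mean_b, 7))  # 630.0366667
```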
I tried the following method, but I don't think it's good:
def moving_average(self, df, col_name='smooth_midprice', k=10):
    ma_cols = []
    mb_cols = []
    temp_df = pd.DataFrame()
    for i in range(0, k+1):
        ma_col = 'M_A_{}'.format(i)
        ma_cols.append(ma_col)
        mb_col = 'M_B_{}'.format(i)
        mb_cols.append(mb_col)
        temp_df[ma_col] = df[col_name].shift(i)
        temp_df[mb_col] = df[col_name].shift(-i)
    df['M_A'] = temp_df[ma_cols].mean(axis=1, skipna=True, numeric_only=True)
    df['M_B'] = temp_df[mb_cols].mean(axis=1, skipna=True, numeric_only=True)
    return df
Answered 2018-06-10 16:41:46
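Use rolling (note that .iloc[::-1] reverses the order of the df). Relative to the question's definitions, mean_a here is the backward-looking mean and mean_b the forward-looking one, as the output at index 10 shows.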
df['mean_a'] = df.Price.rolling(3,min_periods =1).mean()
df['mean_b'] = df.Price.iloc[::-1].rolling(3,min_periods =1).mean()
df
Out[9]:
Price mean_a mean_b
0 1.00 1.000000 5.490000
1 12.23 6.615000 9.380000
2 3.24 5.490000 55.296667
3 12.67 9.380000 60.876667
6 149.98 55.296667 684.396667
7 19.98 60.876667 635.066667
8 1883.23 684.396667 630.036667
9 1.99 635.066667 5.623333
10 4.89 630.036667 9.290000
11 9.99 5.623333 13.736667
12 12.99 9.290000 16.403333
13 18.23 13.736667 18.400000
14 17.99 16.403333 18.360000
15 18.98 18.400000 18.730000
16 18.11 18.360000 19.170000
17 19.10 18.730000 646.900000
18 20.30 19.170000 647.290000
19 1901.30 646.900000 960.785000
20 20.27 647.290000 20.270000
Fixing your code:
col_name = 'Price'
k = 10
ma_cols = []
mb_cols = []
temp_df = pd.DataFrame()
for i in range(0, k + 1):
    ma_col = 'M_A_{}'.format(i)
    ma_cols.append(ma_col)
    mb_col = 'M_B_{}'.format(i)
    mb_cols.append(mb_col)
    temp_df[ma_col] = df[col_name].shift(i)
    temp_df[mb_col] = df[col_name].shift(-i)
# Keep the first 3 shifted values per row (change 3 to k), then average them.
# Series.mean(level=0) was removed in pandas 2.0, so group by the row index instead.
df['M_A'] = temp_df[ma_cols].stack().groupby(level=0).head(3).groupby(level=0).mean()
df['M_B'] = temp_df[mb_cols].stack().groupby(level=0).head(3).groupby(level=0).mean()
df
Out[35]:
Price mean_a mean_b M_A M_B
0 1.00 1.000000 5.490000 1.000000 5.490000
1 12.23 6.615000 9.380000 6.615000 9.380000
2 3.24 5.490000 55.296667 5.490000 55.296667
3 12.67 9.380000 60.876667 9.380000 60.876667
6 149.98 55.296667 684.396667 55.296667 684.396667
7 19.98 60.876667 635.066667 60.876667 635.066667
8 1883.23 684.396667 630.036667 684.396667 630.036667
9 1.99 635.066667 5.623333 635.066667 5.623333
10 4.89 630.036667 9.290000 630.036667 9.290000
11 9.99 5.623333 13.736667 5.623333 13.736667
12 12.99 9.290000 16.403333 9.290000 16.403333
13 18.23 13.736667 18.400000 13.736667 18.400000
14 17.99 16.403333 18.360000 16.403333 18.360000
15 18.98 18.400000 18.730000 18.400000 18.730000
16 18.11 18.360000 19.170000 18.360000 19.170000
17 19.10 18.730000 646.900000 18.730000 646.900000
18 20.30 19.170000 647.290000 19.170000 647.290000
19 1901.30 646.900000 960.785000 646.900000 960.785000
20 20.27 647.290000 20.270000 647.290000 20.270000
Answered 2018-06-10 16:49:49
As @Wen said, you can compute mean_a with the rolling function:
df['mean_a'] = df['Price'].rolling(3).mean()
df['mean_b'] is just df['mean_a'] shifted by -2:
df['mean_b'] = df['mean_a'].shift(-2)
This returns:
# Price mean_a mean_b
0 0 1.00 NaN 5.490000
1 1 12.23 NaN 9.380000
2 2 3.24 5.490000 55.296667
3 3 12.67 9.380000 60.876667
4 6 149.98 55.296667 684.396667
5 7 19.98 60.876667 635.066667
6 8 1883.23 684.396667 630.036667
7 9 1.99 635.066667 5.623333
8 10 4.89 630.036667 9.290000
9 11 9.99 5.623333 13.736667
10 12 12.99 9.290000 16.403333
11 13 18.23 13.736667 18.400000
12 14 17.99 16.403333 18.360000
13 15 18.98 18.400000 18.730000
14 16 18.11 18.360000 19.170000
15 17 19.10 18.730000 646.900000
16 18 20.30 19.170000 7397.200000
17 19 1901.30 646.900000 NaN
18 20 20270.00 7397.200000 NaN
Edit:
If you want to avoid NA values, use the min_periods parameter. We can create mean_a with df['mean_a'] = df['Price'].rolling(3, min_periods=1).mean(), but then mean_b can no longer be created by shifting alone, and I can't think of another simple way besides @Wen's method (reverse the slice of the Price series where df['mean_b'] is NA):
df['mean_b'] = df['mean_a'].shift(-2)
mask = df['mean_b'].isna()
df.loc[mask, 'mean_b'] = df['Price'][mask].iloc[::-1].rolling(3, min_periods=1).mean()
Still, reversing the whole series from the start is less of a hassle.
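Reversing the whole series up front might look like the sketch below, generalized to any k and using the question's naming (mean_a looks forward, mean_b looks backward). The helper name add_window_means and its parameters are hypothetical. The last price appears as 20.27k in the question; 20.27 is assumed here, as in the first answer's output.

```python
import pandas as pd

# Sample data copied from the question (the index skips 4 and 5 in the source).
index = [0, 1, 2, 3, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
prices = [1.00, 12.23, 3.24, 12.67, 149.98, 19.98, 1883.23, 1.99, 4.89, 9.99,
          12.99, 18.23, 17.99, 18.98, 18.11, 19.10, 20.30, 1901.30,
          20.27]  # assumption: "20.27k" in the question is read as 20.27
df = pd.DataFrame({"Price": prices}, index=index)


def add_window_means(df, col="Price", k=3):
    """Hypothetical helper: add forward (mean_a) and backward (mean_b) means."""
    out = df.copy()
    s = out[col]
    # Backward-looking mean: current row plus up to k-1 previous rows.
    out["mean_b"] = s.rolling(k, min_periods=1).mean()
    # Forward-looking mean: reverse the series, roll, then reverse back.
    out["mean_a"] = s.iloc[::-1].rolling(k, min_periods=1).mean().iloc[::-1]
    return out


result = add_window_means(df, k=3)
# At index 10 with k=3 this gives mean_a ≈ 9.29 and mean_b ≈ 630.0367,
# matching the worked example in the question.
print(result.loc[10])
```

Because min_periods=1 is used, the first and last rows are averaged over shorter windows instead of becoming NaN.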
Answered 2018-06-10 17:48:20
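def moving_average(df, col_name="Price", k=10):
    # Series.append was removed in pandas 2.0, so collect the means in lists.
    prices = df[col_name]
    mean_a = []
    mean_b = []
    for i in range(df.shape[0]):
        # Forward window: current row and the next k-1 rows.
        mean_a.append(prices.iloc[i:i+k].mean())
        # Backward window: current row and the previous k-1 rows.
        start_b = i-k+1 if i-k+1 >= 0 else 0
        mean_b.append(prices.iloc[start_b:i+1].mean())
    hold = df.copy()
    # Lists are assigned by position, so a non-contiguous index is not a problem.
    hold["mean_a"] = mean_a
    hold["mean_b"] = mean_b
    return hold
https://stackoverflow.com/questions/50785716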