I have a time series that looks like this:
2000 0.000
2001 -0.174
2002 -0.131
2003 0.127
2004 0.566
2005 0.723
2006 0.675
2007 1.171
2008 2.338
2009 2.625
2010 3.746
2011 3.612
2012 4.729
2013 8.156
2014 16.330
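2015 27.584
What is the most efficient way to estimate the linear trend of this series and then compute the difference between the trend line and the series?
Thanks in advance!
Posted on 2019-10-08 06:14:44
Use a simple linear regression model from scikit-learn: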
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
a = {'year':[2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015],'y_true':[0,-0.174,-0.131,0.127,0.566,0.723,0.675,1.171,2.338,2.625,3.746,3.612,4.729,8.156,16.330,27.584]}
df = pd.DataFrame(a)
# scikit-learn expects a 2-D feature array of shape (n_samples, n_features)
x = np.array(df['year']).reshape(-1,1)
y_true = df['y_true']
# Fit the least-squares line y = slope * year + intercept
linear_reg = LinearRegression().fit(x,y_true)
# Evaluate the fitted trend line at each year
y_pred = linear_reg.predict(x)
df['y_pred'] = y_pred
# Difference between the series and its linear trend (the residuals)
df['difference'] = y_true - y_pred
print(df)
Output:
year y_true y_pred difference
0 2000 0.000 -4.366596 4.366596
1 2001 -0.174 -3.183741 3.009741
2 2002 -0.131 -2.000887 1.869887
3 2003 0.127 -0.818032 0.945032
4 2004 0.566 0.364822 0.201178
5 2005 0.723 1.547676 -0.824676
6 2006 0.675 2.730531 -2.055531
7 2007 1.171 3.913385 -2.742385
8 2008 2.338 5.096240 -2.758240
9 2009 2.625 6.279094 -3.654094
10 2010 3.746 7.461949 -3.715949
11 2011 3.612 8.644803 -5.032803
12 2012 4.729 9.827657 -5.098657
13 2013 8.156 11.010512 -2.854512
14 2014 16.330 12.193366 4.136634
15 2015 27.584 13.376221 14.207779
https://stackoverflow.com/questions/58277633
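Since the question asks for the most efficient approach, the same trend and differences can probably be obtained with NumPy alone. The following is only a sketch, not part of the original answer: it uses np.polyfit with degree 1, and the variable names (years, values, trend, residuals) are illustrative. Centering the years before fitting is an optional choice made here to keep the fit numerically well conditioned.

```python
import numpy as np

years = np.arange(2000, 2016)
values = np.array([0.000, -0.174, -0.131, 0.127, 0.566, 0.723, 0.675, 1.171,
                   2.338, 2.625, 3.746, 3.612, 4.729, 8.156, 16.330, 27.584])

# Center the years so the fit works with small numbers (-7.5 .. 7.5).
x = years - years.mean()

# Degree-1 least-squares fit; polyfit returns the highest-order
# coefficient first, so this unpacks as (slope, intercept).
slope, intercept = np.polyfit(x, values, deg=1)

# Trend line evaluated at each year, and series minus trend.
trend = slope * x + intercept
residuals = values - trend

for year, value, fitted, diff in zip(years, values, trend, residuals):
    print(f"{year} {value:8.3f} {fitted:10.6f} {diff:10.6f}")
```

The residuals should match the difference column above up to floating-point rounding, because both approaches fit the same ordinary least-squares line.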