Is there a way to compute prediction intervals (probability distributions) for time series forecasts from an LSTM (or other recurrent) neural network?
For example, I predict 10 samples into the future (t+1 to t+10) based on the 10 most recent observations (t-9 to t), and I would expect the prediction at t+1 to be more accurate than the one at t+10. Typically, one would draw error bars around the predictions to show the interval. With an ARIMA model (under the assumption of normally distributed errors), I can compute a prediction interval (e.g. 95%) around each predicted value. Can I compute the same (or something related to prediction intervals) from an LSTM model?
I have been working with LSTMs in Keras/Python, following many of the examples from machinelearningmastery.com, and my example code (below) is based on them. I have considered reframing the problem as classification into discrete bins, since that yields a confidence per class, but that seems like a poor solution.
There are a few similar topics (such as the ones below), but nothing seems to directly address the question of prediction intervals from an LSTM (or even another) neural network:
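For reference, the ARIMA-style interval I mean is just the point forecast widened by a multiple of the forecast-error standard deviation, which grows with the horizon. A minimal numpy sketch (all numbers hypothetical):

```python
import numpy as np

# Under normally distributed errors, the h-step-ahead 95% interval is
# yhat_h +/- 1.96 * sigma_h, where sigma_h is the standard deviation of
# the h-step forecast error (it grows with the horizon h).
yhat = np.array([1.2, 1.1, 0.9])      # hypothetical point forecasts for t+1..t+3
sigma_h = np.array([0.3, 0.45, 0.6])  # hypothetical forecast-error std devs
z = 1.96                              # two-sided 95% normal quantile
lower, upper = yhat - z * sigma_h, yhat + z * sigma_h
print(lower, upper)
```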
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from math import sin
from matplotlib import pyplot
import numpy as np
# Build an LSTM network and train
def fit_lstm(X, y, batch_size, nb_epoch, neurons):
    X = X.reshape(X.shape[0], 1, X.shape[1])  # add in another dimension to the X data
    y = y.reshape(y.shape[0], y.shape[1])     # but don't add it to the y, as Dense expects 2-D targets
    model = Sequential()
    model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(y.shape[1]))
    model.compile(loss='mean_squared_error', optimizer='adam')
    for i in range(nb_epoch):
        model.fit(X, y, epochs=1, batch_size=batch_size, verbose=1, shuffle=False)
        model.reset_states()
    return model
# Configuration
n = 5000 # total size of dataset
SLIDING_WINDOW_LENGTH = 30
SLIDING_WINDOW_STEP_SIZE = 1
batch_size = 10
test_size = 0.1 # fraction of dataset to hold back for testing
nb_epochs = 100 # for training
neurons = 8 # LSTM layer complexity
# create dataset
#raw_values = [sin(i/2) for i in range(n)] # simple sine wave
raw_values = [sin(i/2)+sin(i/6)+sin(i/36)+np.random.uniform(-1,1) for i in range(n)] # double sine with noise
#raw_values = [(i%4) for i in range(n)] # saw tooth
all_data = np.array(raw_values).reshape(-1,1) # make into array, add another dimension for scikit-learn compatibility
# data is segmented using a sliding window mechanism
all_data_windowed = [np.transpose(all_data[idx:idx+SLIDING_WINDOW_LENGTH]) for idx in np.arange(0,len(all_data)-SLIDING_WINDOW_LENGTH, SLIDING_WINDOW_STEP_SIZE)]
all_data_windowed = np.concatenate(all_data_windowed, axis=0).astype(np.float32)
# split data into train and test-sets
# round datasets down to a multiple of the batch size
test_length = int(round((len(all_data_windowed) * test_size) / batch_size) * batch_size)
train, test = all_data_windowed[:-test_length,:], all_data_windowed[-test_length:,:]
train_length = int(np.floor(train.shape[0] / batch_size)*batch_size)
train = train[:train_length,...]
half_size = int(SLIDING_WINDOW_LENGTH/2) # split the examples half-half, to forecast the second half
X_train, y_train = train[:,:half_size], train[:,half_size:]
X_test, y_test = test[:,:half_size], test[:,half_size:]
# fit the model
lstm_model = fit_lstm(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epochs, neurons=neurons)
# forecast the entire training dataset to build up state for forecasting
X_train_reshaped = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
lstm_model.predict(X_train_reshaped, batch_size=batch_size)
# predict from test dataset
X_test_reshaped = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])
yhat = lstm_model.predict(X_test_reshaped, batch_size=batch_size)
#%% Plot prediction vs actual
x_axis_input = range(half_size)
x_axis_output = [x_axis_input[-1]] + list(half_size+np.array(range(half_size)))
fig = pyplot.figure()
ax = fig.add_subplot(111)
line1, = ax.plot(x_axis_input,np.zeros_like(x_axis_input), 'r-')
line2, = ax.plot(x_axis_output,np.zeros_like(x_axis_output), 'o-')
line3, = ax.plot(x_axis_output,np.zeros_like(x_axis_output), 'g-')
ax.set_xlim(np.min(x_axis_input),np.max(x_axis_output))
ax.set_ylim(-4,4)
pyplot.legend(('Input','Actual','Predicted'),loc='upper left')
pyplot.show()
# update plot in a loop
for idx in range(y_test.shape[0]):
    sample_input = X_test[idx]
    sample_truth = [sample_input[-1]] + list(y_test[idx]) # join lists
    sample_predicted = [sample_input[-1]] + list(yhat[idx])
    line1.set_ydata(sample_input)
    line2.set_ydata(sample_truth)
    line3.set_ydata(sample_predicted)
    fig.canvas.draw()
    fig.canvas.flush_events()
    pyplot.pause(.25)
Answered 2017-11-06 12:47:08
Directly, this is not possible. However, if you model the problem differently, you can get confidence intervals out. Instead of a normal regression approach, you can estimate a continuous probability distribution, and by doing this for every step you can plot your distribution. Ways of doing this are Kernel Mixture Networks (https://janvdvegt.github.io/2017/06/07/Kernel-Mixture-Networks.html) and Mixture Density Networks (http://www.cedar.buffalo.edu/~srihari/CSE574/Chap5/Chap5.7-MixDensityNetworks.pdf). The first uses a fixed set of kernels as a basis and estimates a mixture over those kernels; the second estimates a mixture of distributions, including the parameters of each distribution. You train the model using the log-likelihood.
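To make the training objective concrete, here is a minimal numpy sketch of the (negative) log-likelihood of a 1-D Gaussian mixture that a mixture density network would minimize; the network itself would output `weights`, `means`, and `sigmas` per sample, which are placeholders here:

```python
import numpy as np

def gaussian_mixture_nll(y, weights, means, sigmas):
    """Mean negative log-likelihood of targets y under a 1-D Gaussian mixture.

    y:       (n,)   observed target values
    weights: (n, k) mixture weights per sample (rows sum to 1)
    means:   (n, k) component means
    sigmas:  (n, k) component standard deviations (positive)
    """
    # Component densities, shape (n, k)
    norm = 1.0 / (np.sqrt(2 * np.pi) * sigmas)
    densities = norm * np.exp(-0.5 * ((y[:, None] - means) / sigmas) ** 2)
    # Mixture density per sample, then the average negative log-likelihood
    mixture = np.sum(weights * densities, axis=1)
    return -np.mean(np.log(mixture + 1e-12))

# Toy usage: two components per sample, the first centred on the truth
y = np.array([0.0, 1.0])
weights = np.array([[0.8, 0.2], [0.8, 0.2]])
means = np.array([[0.0, 3.0], [1.0, 3.0]])
sigmas = np.ones((2, 2))
print(gaussian_mixture_nll(y, weights, means, sigmas))
```

Once trained, the predicted mixture at each forecast step is a full distribution, from which you can read off any interval you like.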
Another option for modelling uncertainty is to use dropout during training and also during inference. You do this multiple times, and each time you get a sample from your posterior. Note that you do not get a distribution, only samples, but it is the easiest to implement and it works very well.
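A toy numpy sketch of that idea (the weights here are stand-ins for a trained network; in Keras the equivalent trick is to keep the `Dropout` layers active at prediction time, e.g. by calling the model with `training=True`):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained network": one hidden layer with weights fixed after training.
W1 = rng.normal(size=(10, 32))
W2 = rng.normal(size=(32, 1)) / 32

def predict_with_dropout(x, rate=0.5):
    """One stochastic forward pass: dropout stays ON at inference time."""
    h = np.maximum(x @ W1, 0)           # ReLU hidden layer
    mask = rng.random(h.shape) > rate   # fresh random dropout mask per call
    h = h * mask / (1 - rate)           # inverted-dropout scaling
    return (h @ W2).ravel()

# Repeat the stochastic pass many times to collect posterior samples
x = rng.normal(size=(1, 10))
samples = np.array([predict_with_dropout(x) for _ in range(500)])

# Empirical 95% interval from the Monte Carlo samples
lo, hi = np.percentile(samples, [2.5, 97.5])
print(lo, hi)
```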
In your case, you have to think about how you generate t+2 through t+10. Depending on your current setup, you might have to sample from the previous time step and feed that in as input for the next one, and that does not work very well with either the first approach or the second. If you have 10 outputs per time step (t+1 through t+10), all of these approaches are cleaner, but slightly less intuitive.
Answered 2018-05-14 08:54:05
Conformal prediction might be an interesting buzzword for you, because it works under many conditions - in particular, it does not require normally distributed errors, and it works for almost any machine learning model.
Two nice introductions are given by Scott Locklin and Henrik Linusson.
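A minimal numpy sketch of the simplest variant, split conformal prediction: hold out a calibration set, score it with absolute residuals from any model (an LSTM included), and widen new point forecasts by the appropriate residual quantile:

```python
import numpy as np

def split_conformal_interval(residuals, y_pred, alpha=0.05):
    """Widen point forecasts y_pred into intervals with roughly 1 - alpha
    coverage, using absolute residuals from a held-out calibration set.
    No distributional assumptions on the errors are needed."""
    n = len(residuals)
    # Conformal quantile with the finite-sample (n + 1) correction
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.sort(np.abs(residuals))[min(k, n) - 1]
    return y_pred - q, y_pred + q

# Toy usage: calibration residuals from some fitted model
rng = np.random.default_rng(1)
residuals = rng.normal(0, 1, size=1000)
lower, upper = split_conformal_interval(residuals, y_pred=np.array([5.0]))
print(lower, upper)
```

For multi-step forecasts, one simple option is to calibrate a separate quantile per horizon (t+1 through t+10), which naturally yields wider intervals at longer horizons.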
Answered 2017-11-07 04:30:05
I am going to diverge a bit and argue that computing confidence intervals is usually not a valuable thing to do in practice. The reason is that you always have to make a whole pile of assumptions. Even for the simplest linear regression, you need to assume a linear relationship, homoscedasticity, and independent, normally distributed errors; the assumptions behind an LSTM would be far harder to justify.
A much more pragmatic approach is to do a Monte Carlo simulation. If you already know, or are willing to assume, the distribution of your input variables, take a whole bunch of samples, feed them to your LSTM, and compute your "confidence interval" empirically.
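A minimal numpy sketch of that simulation, with a dummy `model_predict` standing in for `lstm_model.predict` and an assumed Gaussian noise model on the inputs:

```python
import numpy as np

rng = np.random.default_rng(42)

def model_predict(window):
    """Stand-in for lstm_model.predict: any deterministic forecaster works."""
    return window.mean(axis=1, keepdims=True) * np.ones((window.shape[0], 3))

# Observed input window and an assumed noise model for the inputs
x_obs = np.sin(np.arange(15) / 2.0)[None, :]
n_sims, input_sigma = 1000, 0.1

# Perturb the inputs according to their assumed distribution, forecast each
perturbed = x_obs + rng.normal(0, input_sigma, size=(n_sims, x_obs.shape[1]))
forecasts = model_predict(perturbed)           # shape (n_sims, horizon)

# Empirical 95% interval per forecast step
lower, upper = np.percentile(forecasts, [2.5, 97.5], axis=0)
print(lower, upper)
```

Note this propagates input uncertainty only; it says nothing about model or parameter uncertainty, which is exactly the trade-off the answer above is arguing for.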
https://datascience.stackexchange.com/questions/24403