Is there a way to compute prediction intervals (probability distributions) for time series forecasts from an LSTM (or other recurrent) neural network?
For example, I predict 10 samples into the future (t+1 to t+10) based on the 10 most recent observations (t-9 to t), and I would expect the prediction at t+1 to be more accurate than the one at t+10. Typically, one would draw error bars around the predictions to show the interval. With an ARIMA model (under the assumption of normally distributed errors), I can compute a prediction interval (e.g. 95%) around each predicted value. Can I compute the same (or something related to prediction intervals) from an LSTM model?
I have been working with LSTMs in Keras/Python, following many of the examples from machinelearningmastery.com, and my example code (below) is based on them. I have considered reframing the problem as classification into discrete bins, since that yields a confidence per class, but that seems like a poor solution.
There are a few similar topics (such as the ones below), but nothing seems to directly address the question of prediction intervals from an LSTM (or even another) neural network:
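For reference, the ARIMA-style interval I mean is just the point forecast widened by a multiple of the forecast-error standard deviation, which grows with the horizon. A minimal numpy sketch (all numbers hypothetical):

```python
import numpy as np

# Under normally distributed errors, the h-step-ahead 95% interval is
# yhat_h +/- 1.96 * sigma_h, where sigma_h is the standard deviation of
# the h-step forecast error (it grows with the horizon h).
yhat = np.array([1.2, 1.1, 0.9])      # hypothetical point forecasts for t+1..t+3
sigma_h = np.array([0.3, 0.45, 0.6])  # hypothetical forecast-error std devs
z = 1.96                              # two-sided 95% normal quantile
lower, upper = yhat - z * sigma_h, yhat + z * sigma_h
print(lower, upper)
```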
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from math import sin
from matplotlib import pyplot
import numpy as np
# Build an LSTM network and train
def fit_lstm(X, y, batch_size, nb_epoch, neurons):
    X = X.reshape(X.shape[0], 1, X.shape[1])  # add in another dimension to the X data
    y = y.reshape(y.shape[0], y.shape[1])     # but don't add it to the y, as Dense expects 2-D targets
    model = Sequential()
    model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(y.shape[1]))
    model.compile(loss='mean_squared_error', optimizer='adam')
    for i in range(nb_epoch):
        model.fit(X, y, epochs=1, batch_size=batch_size, verbose=1, shuffle=False)
        model.reset_states()
    return model
# Configuration
n = 5000 # total size of dataset
SLIDING_WINDOW_LENGTH = 30
SLIDING_WINDOW_STEP_SIZE = 1
batch_size = 10
test_size = 0.1 # fraction of dataset to hold back for testing
nb_epochs = 100 # for training
neurons = 8 # LSTM layer complexity
# create dataset
#raw_values = [sin(i/2) for i in range(n)] # simple sine wave
raw_values = [sin(i/2)+sin(i/6)+sin(i/36)+np.random.uniform(-1,1) for i in range(n)] # double sine with noise
#raw_values = [(i%4) for i in range(n)] # saw tooth
all_data = np.array(raw_values).reshape(-1,1) # make into array, add another dimension for scikit-learn compatibility
# data is segmented using a sliding window mechanism
all_data_windowed = [np.transpose(all_data[idx:idx+SLIDING_WINDOW_LENGTH]) for idx in np.arange(0,len(all_data)-SLIDING_WINDOW_LENGTH, SLIDING_WINDOW_STEP_SIZE)]
all_data_windowed = np.concatenate(all_data_windowed, axis=0).astype(np.float32)
# split data into train and test-sets
# round datasets down to a multiple of the batch size
test_length = int(round((len(all_data_windowed) * test_size) / batch_size) * batch_size)
train, test = all_data_windowed[:-test_length,:], all_data_windowed[-test_length:,:]
train_length = int(np.floor(train.shape[0] / batch_size)*batch_size)
train = train[:train_length,...]
half_size = int(SLIDING_WINDOW_LENGTH/2) # split the examples half-half, to forecast the second half
X_train, y_train = train[:,:half_size], train[:,half_size:]
X_test, y_test = test[:,:half_size], test[:,half_size:]
# fit the model
lstm_model = fit_lstm(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epochs, neurons=neurons)
# forecast the entire training dataset to build up state for forecasting
X_train_reshaped = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
lstm_model.predict(X_train_reshaped, batch_size=batch_size)
# predict from test dataset
X_test_reshaped = X_test.reshape(X_test.shape[0], 1, X_test.shape[1])
yhat = lstm_model.predict(X_test_reshaped, batch_size=batch_size)
#%% Plot prediction vs actual
x_axis_input = range(half_size)
x_axis_output = [x_axis_input[-1]] + list(half_size+np.array(range(half_size)))
fig = pyplot.figure()
ax = fig.add_subplot(111)
line1, = ax.plot(x_axis_input,np.zeros_like(x_axis_input), 'r-')
line2, = ax.plot(x_axis_output,np.zeros_like(x_axis_output), 'o-')
line3, = ax.plot(x_axis_output,np.zeros_like(x_axis_output), 'g-')
ax.set_xlim(np.min(x_axis_input),np.max(x_axis_output))
ax.set_ylim(-4,4)
pyplot.legend(('Input','Actual','Predicted'),loc='upper left')
pyplot.show()
# update plot in a loop
for idx in range(y_test.shape[0]):
    sample_input = X_test[idx]
    sample_truth = [sample_input[-1]] + list(y_test[idx]) # join lists
    sample_predicted = [sample_input[-1]] + list(yhat[idx])
    line1.set_ydata(sample_input)
    line2.set_ydata(sample_truth)
    line3.set_ydata(sample_predicted)
    fig.canvas.draw()
    fig.canvas.flush_events()
    pyplot.pause(.25)
Answered 2017-11-06 12:47:08
Directly, this is not possible. However, if you model the problem differently, you can get confidence intervals out. Instead of a normal regression approach, you can estimate a continuous probability distribution, and by doing this for every step you can plot your distribution. Ways of doing this are Kernel Mixture Networks (https://janvdvegt.github.io/2017/06/07/Kernel-Mixture-Networks.html) and Mixture Density Networks (http://www.cedar.buffalo.edu/~srihari/CSE574/Chap5/Chap5.7-MixDensityNetworks.pdf). The first uses a fixed set of kernels as a basis and estimates a mixture over those kernels; the second estimates a mixture of distributions, including the parameters of each distribution. You train the model using the log-likelihood.
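To make the training objective concrete, here is a minimal numpy sketch of the (negative) log-likelihood of a 1-D Gaussian mixture that a mixture density network would minimize; the network itself would output `weights`, `means`, and `sigmas` per sample, which are placeholders here:

```python
import numpy as np

def gaussian_mixture_nll(y, weights, means, sigmas):
    """Mean negative log-likelihood of targets y under a 1-D Gaussian mixture.

    y:       (n,)   observed target values
    weights: (n, k) mixture weights per sample (rows sum to 1)
    means:   (n, k) component means
    sigmas:  (n, k) component standard deviations (positive)
    """
    # Component densities, shape (n, k)
    norm = 1.0 / (np.sqrt(2 * np.pi) * sigmas)
    densities = norm * np.exp(-0.5 * ((y[:, None] - means) / sigmas) ** 2)
    # Mixture density per sample, then the average negative log-likelihood
    mixture = np.sum(weights * densities, axis=1)
    return -np.mean(np.log(mixture + 1e-12))

# Toy usage: two components per sample, the first centred on the truth
y = np.array([0.0, 1.0])
weights = np.array([[0.8, 0.2], [0.8, 0.2]])
means = np.array([[0.0, 3.0], [1.0, 3.0]])
sigmas = np.ones((2, 2))
print(gaussian_mixture_nll(y, weights, means, sigmas))
```

Once trained, the predicted mixture at each forecast step is a full distribution, from which you can read off any interval you like.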
Another option for modelling uncertainty is to use dropout during training and also during inference. You do this multiple times, and each time you get a sample from your posterior. Note that you do not get a distribution, only samples, but it is the easiest to implement and it works very well.
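A toy numpy sketch of that idea (the weights here are stand-ins for a trained network; in Keras the equivalent trick is to keep the `Dropout` layers active at prediction time, e.g. by calling the model with `training=True`):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained network": one hidden layer with weights fixed after training.
W1 = rng.normal(size=(10, 32))
W2 = rng.normal(size=(32, 1)) / 32

def predict_with_dropout(x, rate=0.5):
    """One stochastic forward pass: dropout stays ON at inference time."""
    h = np.maximum(x @ W1, 0)           # ReLU hidden layer
    mask = rng.random(h.shape) > rate   # fresh random dropout mask per call
    h = h * mask / (1 - rate)           # inverted-dropout scaling
    return (h @ W2).ravel()

# Repeat the stochastic pass many times to collect posterior samples
x = rng.normal(size=(1, 10))
samples = np.array([predict_with_dropout(x) for _ in range(500)])

# Empirical 95% interval from the Monte Carlo samples
lo, hi = np.percentile(samples, [2.5, 97.5])
print(lo, hi)
```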
In your case, you have to think about how you generate t+2 through t+10. Depending on your current setup, you might have to sample from the previous time step and feed that in as input for the next one, and that does not work very well with either the first approach or the second. If you have 10 outputs per time step (t+1 through t+10), all of these approaches are cleaner, but slightly less intuitive.
Answered 2018-05-14 08:54:05
Conformal prediction might be an interesting buzzword for you, because it works under many conditions - in particular, it does not require normally distributed errors, and it works for almost any machine learning model.
Two nice introductions are given by Scott Locklin and Henrik Linusson.
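A minimal numpy sketch of the simplest variant, split conformal prediction: hold out a calibration set, score it with absolute residuals from any model (an LSTM included), and widen new point forecasts by the appropriate residual quantile:

```python
import numpy as np

def split_conformal_interval(residuals, y_pred, alpha=0.05):
    """Widen point forecasts y_pred into intervals with roughly 1 - alpha
    coverage, using absolute residuals from a held-out calibration set.
    No distributional assumptions on the errors are needed."""
    n = len(residuals)
    # Conformal quantile with the finite-sample (n + 1) correction
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.sort(np.abs(residuals))[min(k, n) - 1]
    return y_pred - q, y_pred + q

# Toy usage: calibration residuals from some fitted model
rng = np.random.default_rng(1)
residuals = rng.normal(0, 1, size=1000)
lower, upper = split_conformal_interval(residuals, y_pred=np.array([5.0]))
print(lower, upper)
```

For multi-step forecasts, one simple option is to calibrate a separate quantile per horizon (t+1 through t+10), which naturally yields wider intervals at longer horizons.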
Answered 2017-11-07 04:30:05
I am going to diverge a bit and argue that computing confidence intervals is usually not a valuable thing to do in practice. The reason is that you always have to make a whole pile of assumptions. Even for the simplest linear regression, you need to assume a linear relationship, homoscedasticity, and independent, normally distributed errors; the assumptions behind an LSTM would be far harder to justify.
A much more pragmatic approach is to do a Monte Carlo simulation. If you already know, or are willing to assume, the distribution of your input variables, take a whole bunch of samples, feed them to your LSTM, and compute your "confidence interval" empirically.
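A minimal numpy sketch of that simulation, with a dummy `model_predict` standing in for `lstm_model.predict` and an assumed Gaussian noise model on the inputs:

```python
import numpy as np

rng = np.random.default_rng(42)

def model_predict(window):
    """Stand-in for lstm_model.predict: any deterministic forecaster works."""
    return window.mean(axis=1, keepdims=True) * np.ones((window.shape[0], 3))

# Observed input window and an assumed noise model for the inputs
x_obs = np.sin(np.arange(15) / 2.0)[None, :]
n_sims, input_sigma = 1000, 0.1

# Perturb the inputs according to their assumed distribution, forecast each
perturbed = x_obs + rng.normal(0, input_sigma, size=(n_sims, x_obs.shape[1]))
forecasts = model_predict(perturbed)           # shape (n_sims, horizon)

# Empirical 95% interval per forecast step
lower, upper = np.percentile(forecasts, [2.5, 97.5], axis=0)
print(lower, upper)
```

Note this propagates input uncertainty only; it says nothing about model or parameter uncertainty, which is exactly the trade-off the answer above is arguing for.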
https://datascience.stackexchange.com/questions/24403