I want to make sure I understand LSTMs, so I implemented a toy example using the PyTorch framework. As input I use sequences of 10 consecutive numbers, and the value to predict is always the last number of the sequence + 1. For example:
X = 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Y = 16
Since this is a very simple prediction task, I expected the model to work well, but I observe very poor performance: the model predicts a batch of constant values that keeps increasing over the course of training.
I wonder what I am missing. Below is the code I wrote; any help is very much appreciated.
from torch.utils.data import Dataset, TensorDataset, DataLoader, RandomSampler, SequentialSampler
import torch.nn as nn
import torch

class MyDataset(Dataset):
    def __init__(self):
        pass

    def __getitem__(self, index):
        x = torch.tensor([index-9, index-8, index-7, index-6, index-5,
                          index-4, index-3, index-2, index-1, index])
        y = torch.tensor(index + 1)
        return x, y

    def __len__(self):
        return 1000

class LSTM(nn.Module):
    def __init__(self, hidden_layer_size=1, batch_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size
        self.batch_size = batch_size
        self.lstm = nn.LSTM(1, hidden_layer_size)
        self.linear = nn.Linear(10, 1)
        self.hidden_cell = (torch.zeros(1, self.batch_size, self.hidden_layer_size),
                            torch.zeros(1, self.batch_size, self.hidden_layer_size))

    def forward(self, input_seq):
        lstm_out, self.hidden_cell = self.lstm(input_seq.view(10, self.batch_size, -1), self.hidden_cell)
        predictions = self.linear(lstm_out.squeeze().T)
        return predictions

batch_size = 32
epochs = 1000
train = MyDataset()
sampler = RandomSampler(train)
train_dataloader = DataLoader(train, sampler=sampler, batch_size=batch_size, drop_last=True)
model = LSTM(batch_size=batch_size)
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for e in range(epochs):
    for step, batch in enumerate(train_dataloader):
        seq, labels = batch
        optimizer.zero_grad()
        model.hidden_cell = (torch.zeros(1, batch_size, model.hidden_layer_size),
                             torch.zeros(1, batch_size, model.hidden_layer_size))
        y_pred = model(seq.float())
        print(y_pred)
        single_loss = loss_function(y_pred, labels.float())
        single_loss.backward()
        optimizer.step()

Posted on 2020-04-26 17:08:20
There are multiple problems in your forward function. Look at the input you are passing to the LSTM:
input_seq = input_seq.view(10 ,self.batch_size, -1)
print(input_seq[:, 0])
>>> tensor([[168.],
            [ 21.],
            [450.],
            [436.],
            [789.],
            [941.],
            [ -7.],
            [811.],
            [789.],
            [992.]])

This is a sequence of random numbers. You should either transpose input_seq, or, better, pass batch_first=True to the LSTM constructor and simply unsqueeze input_seq before passing it to the LSTM.
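The difference between the two reshapes can be illustrated on a small tensor (a minimal sketch, not part of the original code):

```python
import torch

# A batch of 3 sequences of length 10; each row is one consecutive sequence.
batch = torch.arange(30).float().view(3, 10)

# Reinterpreting the memory as (seq_len, batch, features) interleaves the rows,
# so what the LSTM sees as "the first sequence" is a scrambled column of values:
wrong = batch.view(10, 3, -1)
print(wrong[:, 0].flatten())   # tensor([ 0.,  3.,  6., ..., 27.]) - not a sequence

# With batch_first=True the LSTM expects (batch, seq_len, features); adding a
# trailing feature dimension keeps each sequence intact:
right = batch.unsqueeze(2)     # shape (3, 10, 1)
print(right[0].flatten())      # tensor([0., 1., 2., ..., 9.]) - the real sequence
```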
You also have to update the handling of lstm_out; the only operation needed now is to reshape it to [batch_size x (10 * hidden_size)].
Finally, you need to squeeze the output of the linear layer.
Apart from that, the hidden size of the LSTM is too small: use 10 (or even 100) instead of 1, as only then will the model converge within 1000 epochs. Here is the updated code:
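These two tensor operations can be sketched in isolation with dummy shapes (hidden size 100 and sequence length 10 assumed, matching the updated model):

```python
import torch
import torch.nn as nn

batch_size, seq_len, hidden = 32, 10, 100

# lstm_out from an LSTM with batch_first=True has shape (batch, seq_len, hidden)
lstm_out = torch.randn(batch_size, seq_len, hidden)

# Flatten all timesteps into one feature vector per sample:
flat = lstm_out.reshape(batch_size, -1)   # (32, 1000)

# A linear layer mapping 10 * hidden features to a single prediction,
# then squeeze away the trailing size-1 dimension:
linear = nn.Linear(seq_len * hidden, 1)
pred = linear(flat).squeeze()             # (32,)
print(flat.shape, pred.shape)
```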
class LSTM(nn.Module):
    def __init__(self, hidden_layer_size=100, batch_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size
        self.batch_size = batch_size
        self.lstm = nn.LSTM(1, hidden_layer_size, batch_first=True)
        self.linear = nn.Linear(10 * hidden_layer_size, 1)
        self.hidden_cell = (torch.zeros(1, self.batch_size, self.hidden_layer_size),
                            torch.zeros(1, self.batch_size, self.hidden_layer_size))

    def forward(self, input_seq):
        batch_size = input_seq.size(0)
        input_seq = input_seq.unsqueeze(2)
        lstm_out, self.hidden_cell = self.lstm(input_seq, self.hidden_cell)
        lstm_out = lstm_out.reshape(batch_size, -1)
        predictions = self.linear(lstm_out).squeeze()
        return predictions

https://stackoverflow.com/questions/61435747
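As a quick sanity check, a condensed sketch of the corrected model (hidden-state bookkeeping omitted for brevity, so the LSTM starts from a default zero state) can be run on one dummy batch:

```python
import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, hidden_layer_size=100):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden_layer_size, batch_first=True)
        self.linear = nn.Linear(10 * hidden_layer_size, 1)

    def forward(self, input_seq):
        input_seq = input_seq.unsqueeze(2)      # (batch, 10, 1)
        lstm_out, _ = self.lstm(input_seq)      # (batch, 10, hidden)
        lstm_out = lstm_out.reshape(input_seq.size(0), -1)
        return self.linear(lstm_out).squeeze()  # (batch,)

model = LSTM()
# Three sequences of 10 consecutive numbers, like the dataset produces:
seq = torch.stack([torch.arange(i, i + 10).float() for i in (6, 20, 35)])
print(model(seq).shape)  # torch.Size([3]) - one prediction per sequence
```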