
How to build an RNN without using nn.RNN?

Stack Overflow user
Asked on 2018-04-23 18:27:26
1 answer · 688 views · 0 followers · 0 votes

I need to build an RNN (not using nn.RNN) with the following specs (a rough sketch matching these points follows the list):

- It is a character RNN.
- It should have 1 hidden layer.
- It should have one set of weights:
  - Wxh (from the input layer to the hidden layer)
  - Whh (from the recurrent connection in the hidden layer)
  - Who (from the hidden layer to the output layer)
- I need to use `Tanh` for the hidden layer.
- I need to use softmax for the output layer.
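
For reference, here is a minimal sketch of a cell that satisfies these points. It is only an illustration: the class name VanillaCharRNNCell and the arguments n_vocab and hidden_size are assumed names, not part of the question's code, and the input x is assumed to be a one-hot row vector of shape (1, n_vocab).

import torch

class VanillaCharRNNCell(torch.nn.Module):
    # h_t = tanh(Wxh·x_t + Whh·h_{t-1});  y_t = softmax(Who·h_t)
    def __init__(self, n_vocab, hidden_size):
        super().__init__()
        self.Wxh = torch.nn.Linear(n_vocab, hidden_size)      # input layer -> hidden layer
        self.Whh = torch.nn.Linear(hidden_size, hidden_size)  # recurrent connection in the hidden layer
        self.Who = torch.nn.Linear(hidden_size, n_vocab)      # hidden layer -> output layer

    def forward(self, x, h):
        h_next = torch.tanh(self.Wxh(x) + self.Whh(h))
        # Probabilities over the vocabulary; keep raw logits here instead
        # if the loss function is CrossEntropyLoss (it applies log-softmax itself).
        y = torch.softmax(self.Who(h_next), dim=1)
        return y, h_next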

I have implemented the code, using CrossEntropyLoss() as the loss function, and it gives me this error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-33-94b42540bc4f> in <module>()
     25         print("target ",target_tensor[timestep])
     26 
---> 27         loss += criterion(output,target_tensor[timestep].view(1,n_vocab))
     28 
     29     loss.backward()

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    323         for hook in self._forward_pre_hooks.values():
    324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
    326         for hook in self._forward_hooks.values():
    327             hook_result = hook(self, input, result)

/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
    145         _assert_no_grad(target)
    146         return F.nll_loss(input, target, self.weight, self.size_average,
--> 147                           self.ignore_index, self.reduce)
    148 
    149 

/opt/anaconda/lib/python3.6/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce)
   1047         weight = Variable(weight)
   1048     if dim == 2:
-> 1049         return torch._C._nn.nll_loss(input, target, weight, size_average, ignore_index, reduce)
   1050     elif dim == 4:
   1051         return torch._C._nn.nll_loss2d(input, target, weight, size_average, ignore_index, reduce)

RuntimeError: multi-target not supported at /opt/conda/conda-bld/pytorch_1513368888240/work/torch/lib/THNN/generic/ClassNLLCriterion.c:22


Here is my model code:

class CharRNN(torch.nn.Module):

    def __init__(self,input_size,hidden_size,output_size, n_layers = 1):

        super(CharRNN, self).__init__()
        self.input_size  = input_size
        self.hidden_size = hidden_size
        self.n_layers    = 1

        self.x2h_i = torch.nn.Linear(input_size + hidden_size, hidden_size)
        self.x2h_f = torch.nn.Linear(input_size + hidden_size, hidden_size)
        self.x2h_o = torch.nn.Linear(input_size + hidden_size, hidden_size)
        self.x2h_q = torch.nn.Linear(input_size + hidden_size, hidden_size)
        self.h2o   = torch.nn.Linear(hidden_size, output_size)
        self.sigmoid = torch.nn.Sigmoid()
        self.softmax = torch.nn.Softmax()
        self.tanh    = torch.nn.Tanh()

    def forward(self, input, h_t, c_t):

        combined_input = torch.cat((input,h_t),1)

        i_t = self.sigmoid(self.x2h_i(combined_input))
        f_t = self.sigmoid(self.x2h_f(combined_input))
        o_t = self.sigmoid(self.x2h_o(combined_input))
        q_t = self.tanh(self.x2h_q(combined_input))

        c_t_next = f_t*c_t + i_t*q_t
        h_t_next = o_t*self.tanh(c_t_next)

        output = self.softmax(h_t_next)
        return output, h_t, c_t

    def initHidden(self):
        return torch.autograd.Variable(torch.zeros(1, self.hidden_size))

    def weights_init(self,model):

        classname = model.__class__.__name__
        if classname.find('Linear') != -1:
            model.weight.data.normal_(0.0, 0.02)
            model.bias.data.fill_(0)


And this is the code that trains the model:

input_tensor  = torch.autograd.Variable(torch.zeros(seq_length,n_vocab))
target_tensor = torch.autograd.Variable(torch.zeros(seq_length,n_vocab))

model   = CharRNN(input_size = n_vocab, hidden_size = hidden_size, output_size = output_size)
model.apply(model.weights_init)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate)

for i in range(n_epochs):
    print("Iteration", i)

    start_idx    = np.random.randint(0, n_chars-seq_length-1)
    train_data   = raw_text[start_idx:start_idx + seq_length + 1]

    input_tensor = torch.autograd.Variable(seq2tensor(train_data[:-1],n_vocab), requires_grad = True)
    target_tensor= torch.autograd.Variable(seq2tensor(train_data[1:],n_vocab), requires_grad = False).long()

    loss = 0

    h_t = torch.autograd.Variable(torch.zeros(1,hidden_size))
    c_t = torch.autograd.Variable(torch.zeros(1,hidden_size))

    for timestep in range(seq_length):

        output, h_t, c_t = model(input_tensor[timestep].view(1,n_vocab), h_t, c_t)

        loss += criterion(output,target_tensor[timestep].view(1,n_vocab))

    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

    x_t = input_tensor[0].view(1,n_vocab)
    h_t = torch.autograd.Variable(torch.zeros(1,hidden_size))
    c_t = torch.autograd.Variable(torch.zeros(1,hidden_size))

    gen_seq = []

    for timestep in range(100):
        output, h_t, c_t = model(x_t, h_t, c_t)
        ix = np.random.choice(range(n_vocab), p=output.data.numpy().ravel())
        x_t = torch.autograd.Variable(torch.zeros(1,n_vocab))
        x_t[0,ix] = 1
        gen_seq.append(idx2char[ix])

    txt = ''.join(gen_seq)
    print ('----------------------')
    print (txt)
    print ('----------------------')

Can you please help me?

Thanks in advance.


1 Answer

Stack Overflow user

Answered on 2018-04-24 08:35:49

The problem is your target tensor. It has shape (1, n_classes), i.e. it is a 2-dimensional tensor, but CrossEntropyLoss expects a 1-dimensional tensor.

Put differently, you are supplying a one-hot encoded target tensor, whereas the loss function expects class indices from 0 to n_classes-1. Change the loss computation to:

one_hot_target = target_tensor[timestep].view(1,n_vocab)
_, class_target = torch.max(one_hot_target, dim=1)
loss += criterion(output, class_target)
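
For completeness, here is a small self-contained sketch of the same fix; the vocabulary size and the class index 2 below are just illustrative values:

import torch

n_vocab = 5
criterion = torch.nn.CrossEntropyLoss()

output = torch.randn(1, n_vocab)         # one timestep of model output, shape (1, n_vocab)
one_hot_target = torch.zeros(1, n_vocab)
one_hot_target[0, 2] = 1                 # pretend the true character has class index 2

# criterion(output, one_hot_target) would raise the "multi-target" error,
# because the target is 2-D. Convert the one-hot row to a class index first:
_, class_target = torch.max(one_hot_target, dim=1)   # tensor([2]), shape (1,)
loss = criterion(output, class_target)
print(loss.item())

Building the targets as a 1-D LongTensor of class indices in the first place would avoid the one-hot round trip entirely.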
Votes: 1
The original content of this page is provided by Stack Overflow; translation support is provided by Tencent Cloud Xiaowei's dedicated IT-domain engine.
Original link: https://stackoverflow.com/questions/49987673
