Q: NStepLSTM and the Seq2Seq model

Stack Overflow user

Asked 2017-11-09 13:52:23

1 answer, 244 views, 0 followers, 0 votes

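Dear Chainer community,

I cannot make sense of how NStepLSTM is used in the official seq2seq example (English-to-French translation).

1. Reversing the input:

def __call__(self, xs, ys):
    xs = [x[::-1] for x in xs]  # Reverse x

As far as I understand, xs is the English phrase and ys is the French phrase. Why do you reverse the English phrase?

2. How do you train the network? You embed xs and ys into a continuous space and then feed the encoder with exs to get a latent representation of the English phrase. Then you pass that latent representation to the decoder together with eys. But eys is the continuous representation of the French phrase, and at test time the decoder cannot know the resulting French phrase, can it? How do you apply your network?

hx, cx, _ = self.encoder(None, None, exs)
_, _, os = self.decoder(hx, cx, eys)

3. ys_in = [F.concat([eos, y], axis=0) for y in ys]: why do we put the end-of-sequence token at the beginning?

4. ys = self.xp.full(batch, EOS, 'i'): in def translate, we feed an array of end-of-sequence tokens to the decoder. Why?

What should I do if I do not want to translate sentences, but instead want to build an autoencoder that maps phrases into a latent space?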

Tags: python, chainer

1 Answer

Stack Overflow user

Answered 2017-11-22 07:23:33

Thank you for your question.

Question 1

Please see the original seq2seq paper below. The authors write: "Note that the LSTM reads the input sentence in reverse, because doing so introduces many short term dependencies in the data that make the optimization problem much easier."

(Abstract) "we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly" https://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

I think the official example code reverses the input sentence for the same reason.
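
As a minimal sketch of what that line does, the snippet below applies the same slicing idiom to plain Python lists. The word IDs are made up for illustration; in the example itself the elements of xs are arrays rather than lists.

```python
# Hypothetical word-ID sequences standing in for a batch of English sentences.
xs = [[4, 5, 6], [7, 8]]

# Same idiom as in the example: reverse the words inside each source
# sentence; the order of the sentences in the batch is unchanged.
xs_reversed = [x[::-1] for x in xs]

print(xs_reversed)  # [[6, 5, 4], [8, 7]]
```

Only the source side is reversed; the target sentences ys keep their original word order.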

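Question 2

"Then you put the latent representation into the decoder with eys. But eys is the continuous representation of the French phrase."

Yes. This code is used at training time, so we know the target sentence (the gold words) and can feed it to the decoder.

hx, cx, _ = self.encoder(None, None, exs)
_, _, os = self.decoder(hx, cx, eys)

At test time, you should use def translate(self, xs, max_length=100) instead. This method predicts a sentence from the source sentence xs.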

result = []
for i in range(max_length):
    eys = self.embed_y(ys)
    eys = F.split_axis(eys, batch, 0)
    h, c, ys = self.decoder(h, c, eys)
    cys = F.concat(ys, axis=0)
    wy = self.W(cys)
    ys = self.xp.argmax(wy.data, axis=1).astype('i')
    result.append(ys)

In each iteration of the loop, one word is predicted from the source sentence vector and the previously predicted word ys.
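
The same greedy loop can be sketched without Chainer. In the snippet below, toy_decoder_step is a hypothetical stand-in for the embed_y, decoder and W calls, and the "source sentence vector" is faked with a lookup table, so this only illustrates the control flow, not the model. Stopping as soon as EOS is predicted is a simplification for a single sentence; the loop in the answer always runs max_length steps.

```python
EOS = 1
VOCAB_SIZE = 5

def toy_decoder_step(state, prev_word):
    """Hypothetical stand-in for embed_y + decoder + W.

    `state` plays the role of the encoded source sentence; here it is just
    a table mapping the previous word to the next one.
    """
    next_word = state.get(prev_word, EOS)
    scores = [0.0] * VOCAB_SIZE
    scores[next_word] = 1.0
    return state, scores

def greedy_translate(state, max_length=100):
    prev_word = EOS  # decoding starts from the EOS marker, as in the example
    result = []
    for _ in range(max_length):
        state, scores = toy_decoder_step(state, prev_word)
        # argmax over the vocabulary: pick the highest-scoring word
        prev_word = max(range(VOCAB_SIZE), key=lambda w: scores[w])
        if prev_word == EOS:
            break
        result.append(prev_word)
    return result

source_state = {EOS: 3, 3: 4, 4: 2, 2: EOS}
print(greedy_translate(source_state))  # [3, 4, 2]
```

Each predicted word is fed back in as the next input, which is why no target sentence is needed at test time.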

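Question 3 and Question 4

I think this part should really be ys_in = [F.concat([bos, y], axis=0) for y in ys], with a beginning-of-sentence token. The official code uses eos for both the start and the end marker.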
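
To illustrate why a marker is prepended at all: during training the decoder input is the target shifted right by one position, so that at every step it sees the previous gold word and is asked to predict the next one. The sketch below uses plain lists; the helper name and the ys_out construction are assumptions made for illustration and are not quoted from the answer.

```python
EOS = 1
BOS = 2

def shift_for_teacher_forcing(ys, start_token, end_token=EOS):
    """Hypothetical helper: build decoder inputs and expected outputs."""
    ys_in = [[start_token] + y for y in ys]   # what the decoder reads
    ys_out = [y + [end_token] for y in ys]    # what it should predict
    return ys_in, ys_out

# Made-up word IDs standing in for two French sentences.
ys = [[10, 11, 12], [13, 14]]

ys_in, ys_out = shift_for_teacher_forcing(ys, start_token=BOS)
print(ys_in)   # [[2, 10, 11, 12], [2, 13, 14]]
print(ys_out)  # [[10, 11, 12, 1], [13, 14, 1]]
```

Passing start_token=EOS instead reproduces the behaviour described for the official code, where one token serves as both markers. Whichever start token is used in training must also be the first input at test time.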

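Final question: "What should I do if I do not want to translate sentences, but instead want to build an autoencoder that maps phrases into a latent space?"

When you want to build an autoencoder:

1. Remove the line xs = [x[::-1] for x in xs].
2. Optionally, use bos instead of eos as the start token.

Either eos or bos works as the start token. The only change you need for an autoencoder is to remove the line xs = [x[::-1] for x in xs].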
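
On the data side, an autoencoder setup means the target is the source itself. A minimal sketch, assuming phrases are lists of word IDs; make_autoencoder_pairs is a hypothetical helper, not part of the example.

```python
def make_autoencoder_pairs(phrases):
    """Hypothetical helper: each phrase is both the source and the target."""
    return [(list(p), list(p)) for p in phrases]

# Made-up word IDs for two phrases.
pairs = make_autoencoder_pairs([[4, 5, 6], [7, 8]])
xs = [x for x, _ in pairs]  # encoder input, left unreversed
ys = [y for _, y in pairs]  # decoder target, identical to the input

print(xs == ys)  # True
```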

If you want to use bos, modify the code as follows:

UNK = 0
EOS = 1
BOS = 2

# In __call__, replace:
eos = self.xp.array([EOS], 'i')
ys_in = [F.concat([eos, y], axis=0) for y in ys]
# with (keep the eos definition if other lines still use it):
bos = self.xp.array([BOS], 'i')
ys_in = [F.concat([bos, y], axis=0) for y in ys]

# In translate, replace:
ys = self.xp.full(batch, EOS, 'i')
# with:
ys = self.xp.full(batch, BOS, 'i')

def load_vocabulary(path):
    with open(path) as f:
        # +3 for UNK, EOS and BOS
        word_ids = {line.strip(): i + 3 for i, line in enumerate(f)}
    word_ids['<UNK>'] = 0
    word_ids['<EOS>'] = 1
    word_ids['<BOS>'] = 2
    return word_ids
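
A short usage sketch for the modified loader, run against a throwaway two-word vocabulary file. The file contents are made up, and the function is repeated here only so that the snippet runs on its own.

```python
import os
import tempfile

def load_vocabulary(path):
    with open(path) as f:
        # Regular words start at ID 3; IDs 0 to 2 are reserved below.
        word_ids = {line.strip(): i + 3 for i, line in enumerate(f)}
    word_ids['<UNK>'] = 0
    word_ids['<EOS>'] = 1
    word_ids['<BOS>'] = 2
    return word_ids

# Write a tiny vocabulary file, one word per line.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write('hello\nworld\n')
    path = f.name

try:
    word_ids = load_vocabulary(path)
finally:
    os.remove(path)

print(word_ids)
# {'hello': 3, 'world': 4, '<UNK>': 0, '<EOS>': 1, '<BOS>': 2}
```

Because the word IDs shift by one compared with the original loader, a model trained with the old vocabulary mapping would need to be retrained.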

If you have further questions, please ask again.

Votes: 1

Original content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/47203602
