首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Keras seq2seq -字嵌入

Keras seq2seq -字嵌入
EN

Stack Overflow用户
提问于 2018-03-25 14:45:39
回答 2查看 5.3K关注 0票数 6

我正在研究一个基于Keras中seq2seq的生成式聊天机器人。我使用了这个站点的代码:https://machinelearningmastery.com/develop-encoder-decoder-model-sequence-sequence-prediction-keras/

我的模特看起来是这样的:

代码语言:javascript
复制
# define training encoder
encoder_inputs = Input(shape=(None, n_input))
encoder = LSTM(n_units, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]

# define training decoder
decoder_inputs = Input(shape=(None, n_output))
decoder_lstm = LSTM(n_units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(n_output, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

# define inference encoder
encoder_model = Model(encoder_inputs, encoder_states)

# define inference decoder
decoder_state_input_h = Input(shape=(n_units,))
decoder_state_input_c = Input(shape=(n_units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs [decoder_outputs] + decoder_states)

这个神经网络被设计为与一个热编码向量一起工作,这个网络的输入看起来如下:

代码语言:javascript
复制
[[[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]]
  [[0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0.]]]

我怎样才能重建这些模型来使用文字呢?我想使用字嵌入层,但我不知道如何将嵌入层连接到这些模型。

我的输入应该是[[1,5,6,7,4], [4,5,7,5,4], [7,5,4,2,1]],其中int数是单词的表示。

我什么都试过了,但还是有错误。你能帮帮我吗?

EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2018-04-27 07:06:20

我终于做到了。以下是代码:

代码语言:javascript
复制
Shared_Embedding = Embedding(output_dim=embedding, input_dim=vocab_size, name="Embedding")

encoder_inputs = Input(shape=(sentenceLength,), name="Encoder_input")
encoder = LSTM(n_units, return_state=True, name='Encoder_lstm') 
word_embedding_context = Shared_Embedding(encoder_inputs) 
encoder_outputs, state_h, state_c = encoder(word_embedding_context) 
encoder_states = [state_h, state_c] 
decoder_lstm = LSTM(n_units, return_sequences=True, return_state=True, name="Decoder_lstm")

decoder_inputs = Input(shape=(sentenceLength,), name="Decoder_input")
word_embedding_answer = Shared_Embedding(decoder_inputs) 
decoder_outputs, _, _ = decoder_lstm(word_embedding_answer, initial_state=encoder_states) 
decoder_dense = Dense(vocab_size, activation='softmax', name="Dense_layer") 
decoder_outputs = decoder_dense(decoder_outputs) 

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

encoder_model = Model(encoder_inputs, encoder_states) 

decoder_state_input_h = Input(shape=(n_units,), name="H_state_input") 
decoder_state_input_c = Input(shape=(n_units,), name="C_state_input") 
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c] 
decoder_outputs, state_h, state_c = decoder_lstm(word_embedding_answer, initial_state=decoder_states_inputs) 
decoder_states = [state_h, state_c] 
decoder_outputs = decoder_dense(decoder_outputs)

decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

“模型”是训练模型,encoder_model和decoder_model是推理模型。

票数 6
EN

Stack Overflow用户

发布于 2018-03-26 14:33:20

在下面这个示例的FAQ部分中,它们提供了一个如何在seq2seq中使用嵌入式的示例。我目前正在自己找出推理步骤。我收到后会在这里发邮件。https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/49477097

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档