
Trying to build an encoder-decoder with attention, but the resulting graph is always disconnected. Can you help me?

Stack Overflow user
Asked on 2020-11-18 17:51:30
1 answer · 67 views · 0 followers · score 1
import tensorflow as tf
from tensorflow.keras.layers import (Input, LSTM, Bidirectional, Dense,
                                     Activation, dot, concatenate)
from tensorflow.keras.models import Model

dimensionality = 4

# training encoder
encoder_inputs = Input(shape=(None, num_encoder_tokens))
decoder_inputs = Input(shape=(None, num_decoder_tokens))

encoder = Bidirectional(LSTM(dimensionality, return_sequences=True, return_state=True,
                             go_backwards=True), merge_mode='sum')
encoder_outputs, for_h, for_c, bac_h, bac_c = encoder(encoder_inputs)
encoder_states = [tf.add(for_h, for_c), tf.add(bac_h, bac_c)]

# training decoder
decoder = LSTM(dimensionality, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder(decoder_inputs, initial_state=encoder_states)

# attention: alignment scores, softmax weights, context vector
dot_prod = dot([decoder_outputs, encoder_outputs], axes=[2, 2])
attention = Activation('softmax', name='attention')
attention_vec = attention(dot_prod)

context = dot([attention_vec, encoder_outputs], axes=[2, 1])
decoder_comb = concatenate([context, decoder_outputs], name='decoder_comb')

dense = Dense(num_decoder_tokens, activation='softmax')
output = dense(decoder_comb)

training_model = Model([encoder_inputs, decoder_inputs], output)

You can find the model summary here:

    Model: "functional_12"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_13 (InputLayer)           [(None, None, 1780)] 0                                            
__________________________________________________________________________________________________
bidirectional_2 (Bidirectional) [(None, None, 4), (N 57120       input_13[0][0]                   
__________________________________________________________________________________________________
input_14 (InputLayer)           [(None, None, 2257)] 0                                            
__________________________________________________________________________________________________
tf_op_layer_Add_4 (TensorFlowOp [(None, 4)]          0           bidirectional_2[0][1]            
                                                                 bidirectional_2[0][2]            
__________________________________________________________________________________________________
tf_op_layer_Add_5 (TensorFlowOp [(None, 4)]          0           bidirectional_2[0][3]            
                                                                 bidirectional_2[0][3]            
__________________________________________________________________________________________________
lstm_5 (LSTM)                   [(None, None, 4), (N 36192       input_14[0][0]                   
                                                                 tf_op_layer_Add_4[0][0]          
                                                                 tf_op_layer_Add_5[0][0]          
__________________________________________________________________________________________________
dot_12 (Dot)                    (None, None, None)   0           lstm_5[0][0]                     
                                                                 bidirectional_2[0][0]            
__________________________________________________________________________________________________
attention (Activation)          (None, None, None)   0           dot_12[0][0]                     
__________________________________________________________________________________________________
dot_13 (Dot)                    (None, None, 4)      0           attention[0][0]                  
                                                                 bidirectional_2[0][0]            
__________________________________________________________________________________________________
decoder_comb (Concatenate)      (None, None, 8)      0           dot_13[0][0]                     
                                                                 lstm_5[0][0]                     
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, None, 2257)   20313       decoder_comb[0][0]               
==================================================================================================
Total params: 113,625
Trainable params: 113,625
Non-trainable params: 0
__________________________________________________________________________________________________

Finally, below I have pasted my attempt at separating the encoder and the decoder for inference, but it raises an error. I reused the training_model layers/outputs/inputs as much as I could, but something is still missing.

# inference encoder
encoder_model = Model(encoder_inputs, encoder_states)

# inference decoder
decoder_s_h = Input(shape=(dimensionality,))
decoder_s_c = Input(shape=(dimensionality,))
decoder_states_inputs = [decoder_s_h, decoder_s_c]
decoder_outputs, state_h, state_c = decoder(decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]

dot_prod = dot([decoder_outputs, encoder_outputs], axes=[2, 2])
attention_vec = attention(dot_prod)

context = dot([attention_vec, encoder_outputs], axes=[2, 1])

decoder_comb = concatenate([context, decoder_outputs])

output = dense(decoder_comb)

decoder_model = Model([decoder_inputs] + decoder_states_inputs, [output] + decoder_states)

I have tried changing this configuration many times, but I cannot solve the graph-disconnected problem. Could you help me? PS: I am new to NLP, so please be kind; I am still a student and not yet a deep learning expert. Thank you very much for your time and help!

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-44-89f9761124cf> in <module>()
     18 output= dense(decoder_comb)
     19 
---> 20 decoder_model = Model([decoder_inputs] + decoder_states_inputs, [output] + decoder_states)
     21 
     22 #encoder decoder model

5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/functional.py in _map_graph_network(inputs, outputs)
    929                              'The following previous layers '
    930                              'were accessed without issue: ' +
--> 931                              str(layers_with_complete_input))
    932         for x in nest.flatten(node.outputs):
    933           computable_tensors.add(id(x))

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_13:0", shape=(None, None, 1780), dtype=float32) at layer "bidirectional_2". The following previous layers were accessed without issue: ['lstm_5']

1 Answer

Stack Overflow user

Answered on 2020-11-18 18:56:21

When you create a model with the functional API, you cannot use the input/output attributes.

Try changing it to something like this:

encoder_inputs = Input(shape=(None, num_encoder_tokens))
decoder_inputs = Input(shape=(None, num_decoder_tokens))
encoder_outputs, for_hidden, for_cell, bac_hidden, bac_cell = training_model(encoder_inputs, decoder_inputs)
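For the inference encoder in particular, one common pattern (a sketch under assumed toy sizes, not the asker's exact code) is to make the model return the full encoder output sequence alongside the states, so the attention step in the decoder later has everything it needs:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Bidirectional
from tensorflow.keras.models import Model

# Toy sizes, chosen only for this sketch
dimensionality = 4
num_encoder_tokens = 6

encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = Bidirectional(
    LSTM(dimensionality, return_sequences=True, return_state=True),
    merge_mode='sum')
encoder_outputs, for_h, for_c, bac_h, bac_c = encoder(encoder_inputs)
# Same state combination as in the question; in TF 2.x, tf.add on
# Keras tensors is traced into the functional graph as an op layer
encoder_states = [tf.add(for_h, for_c), tf.add(bac_h, bac_c)]

# Return the output sequence too, not just the states
encoder_model = Model(encoder_inputs, [encoder_outputs] + encoder_states)

# One forward pass with dummy data
src = np.zeros((1, 7, num_encoder_tokens), dtype='float32')
enc_out, h, c = encoder_model.predict(src, verbose=0)
print(enc_out.shape, h.shape, c.shape)  # (1, 7, 4) (1, 4) (1, 4)
```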

The other error comes from this line:

encoder_model = Model(encoder_inputs, encoder_states)

where encoder_states does not depend on encoder_input, so TensorFlow cannot build the graph.
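Following that diagnosis, one way to make the inference decoder self-contained (a hedged sketch with made-up toy sizes, not the asker's exact model) is to feed the encoder's output sequence in as an explicit Input instead of reusing the encoder_outputs tensor from the training graph:

```python
import numpy as np
from tensorflow.keras.layers import (Input, LSTM, Dense, Activation,
                                     dot, concatenate)
from tensorflow.keras.models import Model

# Toy sizes, chosen only for this sketch
dimensionality = 4
num_decoder_tokens = 10

# Layers that would be shared with the training model
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder = LSTM(dimensionality, return_sequences=True, return_state=True)
dense = Dense(num_decoder_tokens, activation='softmax')

# Key change: the encoder output sequence is an Input of this model,
# so every tensor the attention step consumes lives in this graph
enc_outputs_in = Input(shape=(None, dimensionality))
state_h_in = Input(shape=(dimensionality,))
state_c_in = Input(shape=(dimensionality,))

dec_seq, state_h, state_c = decoder(
    decoder_inputs, initial_state=[state_h_in, state_c_in])
scores = dot([dec_seq, enc_outputs_in], axes=[2, 2])
attention_vec = Activation('softmax')(scores)
context = dot([attention_vec, enc_outputs_in], axes=[2, 1])
output = dense(concatenate([context, dec_seq]))

decoder_model = Model(
    [decoder_inputs, enc_outputs_in, state_h_in, state_c_in],
    [output, state_h, state_c])

# One decoding step with dummy data
tok = np.zeros((1, 1, num_decoder_tokens), dtype='float32')
enc = np.zeros((1, 5, dimensionality), dtype='float32')
h0 = np.zeros((1, dimensionality), dtype='float32')
c0 = np.zeros((1, dimensionality), dtype='float32')
probs, h1, c1 = decoder_model.predict([tok, enc, h0, c0], verbose=0)
print(probs.shape)  # (1, 1, 10)
```

At inference time, the encoder model's outputs (sequence plus states) would then be passed straight into this decoder at each decoding step.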

Score: 1
Original page content provided by Stack Overflow; translation supported by Tencent Cloud.
Original link:

https://stackoverflow.com/questions/64890666
