
Why does manually unrolling an LSTM over timesteps give different output from static_rnn?

Stack Overflow user
Asked on 2021-05-20 05:03:39
1 answer · 24 views · 0 following · 0 votes

Here is my code that builds the LSTM manually, one timestep at a time:

import tensorflow as tf
import numpy as np
batch_size = 1
hidden_size = 4
num_steps = 3
input_dim = 5
np.random.seed(123)
input = np.ones([batch_size, num_steps, input_dim], dtype=int)
x = tf.placeholder(dtype=tf.float32, shape=[batch_size, num_steps, input_dim], name='input_x')
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=hidden_size)
initial_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)
outputs = []
with tf.variable_scope('for_loop', initializer= tf.ones_initializer):
    for i in range(num_steps):
        if i > 0:
            tf.get_variable_scope().reuse_variables()
        output = lstm_cell(x[:, i, :], initial_state)
        outputs.append(output)
with tf.Session() as sess:
    init_op = tf.initialize_all_variables()
    sess.run(init_op)
    result = sess.run(outputs, feed_dict={x: input})
    print(result)

Output:

[(array([[0.7536526, 0.7536526, 0.7536526, 0.7536526]], dtype=float32), LSTMStateTuple(c=array([[0.99321693, 0.99321693, 0.99321693, 0.99321693]], dtype=float32), h=array([[0.7536526, 0.7536526, 0.7536526, 0.7536526]], dtype=float32))), 
(array([[0.7536526, 0.7536526, 0.7536526, 0.7536526]], dtype=float32), LSTMStateTuple(c=array([[0.99321693, 0.99321693, 0.99321693, 0.99321693]], dtype=float32), h=array([[0.7536526, 0.7536526, 0.7536526, 0.7536526]], dtype=float32))), 
(array([[0.7536526, 0.7536526, 0.7536526, 0.7536526]], dtype=float32), LSTMStateTuple(c=array([[0.99321693, 0.99321693, 0.99321693, 0.99321693]], dtype=float32), h=array([[0.7536526, 0.7536526, 0.7536526, 0.7536526]], dtype=float32)))]

And here is the equivalent code using static_rnn:

import tensorflow as tf
import numpy as np
batch_size = 1
hidden_size = 4
num_steps = 3
input_dim = 5
np.random.seed(123)
input = np.ones([batch_size, num_steps, input_dim], dtype=int)
x = tf.placeholder(dtype=tf.float32, shape=[batch_size, num_steps, input_dim], name='input_x')
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=hidden_size)
initial_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)
y = tf.unstack(x, axis=1)
with tf.variable_scope('static_rnn', initializer= tf.ones_initializer):
    output, state = tf.nn.static_rnn(lstm_cell, y,  initial_state=initial_state)
with tf.Session() as sess:
    init_op = tf.initialize_all_variables()
    sess.run(init_op)
    result = (sess.run([output, state], feed_dict={x: input}))
    print(result)

Output:

[[array([[0.7536526, 0.7536526, 0.7536526, 0.7536526]], dtype=float32), 
array([[0.9631945, 0.9631945, 0.9631945, 0.9631945]], dtype=float32), 
array([[0.9948382, 0.9948382, 0.9948382, 0.9948382]], dtype=float32)], LSTMStateTuple(c=array([[2.9925175, 2.9925175, 2.9925175, 2.9925175]], dtype=float32), h=array([[0.9948382, 0.9948382, 0.9948382, 0.9948382]], dtype=float32))]

The first timestep produces exactly the same output in both versions, but from the second timestep on they diverge: in the manual version each step seems to have no connection to the steps before or after it, so all three steps produce identical output. I think the manual code is wrong, but I can't figure out how to connect the BasicLSTMCell calls across timesteps. Help!


1 Answer

Stack Overflow user

Accepted answer

Posted on 2021-05-20 08:26:22

Thanks to @Susmit Agrawal, I changed the loop to:

    for i in range(num_steps):
        if i > 0:
            # Thread the state: outputs[i-1][1] is the LSTMStateTuple
            # returned by the previous step, fed into the current step.
            output = lstm_cell(x[:, i, :], outputs[i-1][1])
        else:
            # z_state is the zero initial state, i.e.
            # lstm_cell.zero_state(batch_size, dtype=tf.float32).
            output = lstm_cell(x[:, i, :], z_state)
        outputs.append(output)

This produces the same correct output as static_rnn. The original loop passed `initial_state` (the zero state) to the cell at every timestep, so each step computed the same function of the same inputs; the fix passes each step's returned state into the next step, which is exactly what static_rnn does internally.
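The difference is easy to reproduce outside TensorFlow. The sketch below reimplements the BasicLSTMCell arithmetic by hand in NumPy under the question's setup (all-ones kernel from `tf.ones_initializer`, zero bias, default `forget_bias=1.0`, gate order i, j, f, o) and runs both loops; this is an illustrative re-derivation, not the TensorFlow code itself:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, c, h, forget_bias=1.0):
    """One BasicLSTMCell step with an all-ones kernel and zero bias,
    mirroring the question's tf.ones_initializer setup."""
    hidden = h.shape[-1]
    concat = np.concatenate([x, h], axis=-1)            # [batch, input_dim + hidden]
    kernel = np.ones((concat.shape[-1], 4 * hidden))    # ones initializer
    i, j, f, o = np.split(concat @ kernel, 4, axis=-1)  # gate order: i, j, f, o
    new_c = sigmoid(f + forget_bias) * c + sigmoid(i) * np.tanh(j)
    new_h = sigmoid(o) * np.tanh(new_c)
    return new_c, new_h

x = np.ones((1, 5))                 # batch_size=1, input_dim=5, all-ones input
zero_c = zero_h = np.zeros((1, 4))  # hidden_size=4

# Correct: thread each step's state into the next (what static_rnn does).
c, h = zero_c, zero_h
threaded = []
for _ in range(3):
    c, h = lstm_step(x, c, h)
    threaded.append(h[0, 0])

# Broken: re-feed the zero initial state at every step (the question's loop).
broken = []
for _ in range(3):
    _, h = lstm_step(x, zero_c, zero_h)
    broken.append(h[0, 0])

print(threaded)  # ~[0.7536526, 0.9631945, 0.9948382] -- matches static_rnn
print(broken)    # ~[0.7536526, 0.7536526, 0.7536526] -- every step identical
```

The threaded loop reproduces the static_rnn outputs from the question, while the broken loop reproduces the three identical outputs of the manual version, confirming that the only difference is whether the state is carried forward.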

Votes: 0

The original page content is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/67614261
