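Define the model:

from keras.layers import LSTM, Input
from keras.models import Model

inputs = Input(batch_shape=(32, 10, 1))
lstm_layer = LSTM(10, stateful=True)(inputs)

model = Model(inputs, lstm_layer)
model.compile(optimizer="adam", loss="mse")

It is important to build and compile the model first, because the initial states are reset during compilation. You also need to specify a batch_shape that fixes the batch size, because in this scenario the network has to be stateful (which is done by setting stateful=True).

Now we can set the values of the initial states:

import numpy
import keras.backend as K

hidden_states = K.variable(value=numpy.random.normal(size=(32, 10)))
cell_states = K.variable(value=numpy.random.normal(size=(32, 10)))

model.layers[1].states[0] = hidden_states
model.layers[1].states[1] = cell_states

Note that you need to provide the states as Keras variables. states[0] holds the hidden state and states[1] holds the cell state.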

Hope this helps.

Votes: 21

Stack Overflow user

Posted on 2020-02-20 13:47:58

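As stated in the Keras documentation for recurrent layers (https://keras.io/layers/recurrent/):

Note on specifying the initial state of RNNs

You can specify the initial state of RNN layers symbolically by calling them with the keyword argument initial_state. The value of initial_state should be a tensor or list of tensors representing the initial state of the RNN layer. You can specify the initial state of RNN layers numerically by calling reset_states with the keyword argument states. The value of states should be a numpy array or list of numpy arrays representing the initial state of the RNN layer.

Since the LSTM layer has two states (the hidden state and the cell state), the value of initial_state and of states is a list of two tensors.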
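To illustrate why an LSTM needs two state tensors of the same shape, here is a minimal NumPy sketch of a single LSTM time step using the standard LSTM equations. It is not Keras code: the function name lstm_step, the weight names W, U, b, and the gate order are assumptions made for this illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (hypothetical helper, standard equations).

    x_t: (batch, features); h_prev and c_prev: (batch, units)
    W: (features, 4 * units); U: (units, 4 * units); b: (4 * units,)
    The gate order i, f, g, o is an arbitrary choice for this sketch.
    """
    z = x_t @ W + h_prev @ U + b
    i, f, g, o = np.split(z, 4, axis=-1)
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # new cell state
    h_t = sigmoid(o) * np.tanh(c_t)                      # new hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
batch, features, units = 1, 1, 8
W = rng.normal(size=(features, 4 * units))
U = rng.normal(size=(units, 4 * units))
b = np.zeros(4 * units)

x_t = rng.normal(size=(batch, features))
h_0 = rng.random((batch, units))  # initial hidden state
c_0 = rng.random((batch, units))  # initial cell state

# Same input, once with a custom initial state and once with zeros.
h_1, c_1 = lstm_step(x_t, h_0, c_0, W, U, b)
h_1_zero, c_1_zero = lstm_step(x_t, np.zeros_like(h_0), np.zeros_like(c_0), W, U, b)
print(h_1.shape, c_1.shape)
```

Both states have shape (batch, units), which is why initial_state is passed as a list of two tensors of that shape, and a different initial state gives a different result for the same input.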

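Examples

Stateless LSTM

Input shape: (batch, timesteps, features) = (1, 10, 1)

Number of units in the LSTM layer = 8 (i.e. the dimensionality of the hidden state and the cell state)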

import tensorflow as tf
import numpy as np

inputs = np.random.random([1, 10, 1]).astype(np.float32)

lstm = tf.keras.layers.LSTM(8)

c_0 = tf.convert_to_tensor(np.random.random([1, 8]).astype(np.float32))
h_0 = tf.convert_to_tensor(np.random.random([1, 8]).astype(np.float32))

outputs = lstm(inputs, initial_state=[h_0, c_0])

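Stateful LSTM

Input shape: (batch, timesteps, features) = (1, 10, 1)

Number of units in the LSTM layer = 8 (i.e. the dimensionality of the hidden state and the cell state)

Note that for a stateful LSTM you also need to specify the batch size.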

import tensorflow as tf
import numpy as np
from pprint import pprint

inputs = np.random.random([1, 10, 1]).astype(np.float32)

lstm = tf.keras.layers.LSTM(8, stateful=True, batch_input_shape=(1, 10, 1))

c_0 = tf.convert_to_tensor(np.random.random([1, 8]).astype(np.float32))
h_0 = tf.convert_to_tensor(np.random.random([1, 8]).astype(np.float32))

outputs = lstm(inputs, initial_state=[h_0, c_0])

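With a stateful LSTM, the states are not reset at the end of each sequence, and we can see that the output of the layer corresponds to the hidden state at the last time step (i.e. lstm.states[0]):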

>>> pprint(outputs)
<tf.Tensor: id=821, shape=(1, 8), dtype=float32, numpy=
array([[ 0.07119043,  0.07012419, -0.06118739, -0.11008392,  0.00573938,
        -0.05663438,  0.11196419,  0.02663924]], dtype=float32)>
>>>
>>> pprint(lstm.states)
[<tf.Variable 'lstm_1/Variable:0' shape=(1, 8) dtype=float32, numpy=
array([[ 0.07119043,  0.07012419, -0.06118739, -0.11008392,  0.00573938,
        -0.05663438,  0.11196419,  0.02663924]], dtype=float32)>,
 <tf.Variable 'lstm_1/Variable:0' shape=(1, 8) dtype=float32, numpy=
array([[ 0.14726108,  0.13584498, -0.12986949, -0.22309153,  0.0125412 ,
        -0.11446435,  0.22290672,  0.05397629]], dtype=float32)>]

Calling reset_states() resets the states:

>>> lstm.reset_states()
>>> pprint(lstm.states)
[<tf.Variable 'lstm_1/Variable:0' shape=(1, 8) dtype=float32, numpy=array([[0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>,
 <tf.Variable 'lstm_1/Variable:0' shape=(1, 8) dtype=float32, numpy=array([[0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>]
>>>

Or set them to specific values:

>>> lstm.reset_states(states=[h_0, c_0])
>>> pprint(lstm.states)
[<tf.Variable 'lstm_1/Variable:0' shape=(1, 8) dtype=float32, numpy=
array([[0.59103394, 0.68249655, 0.04518601, 0.7800545 , 0.3799634 ,
        0.27347744, 0.54415804, 0.9889024 ]], dtype=float32)>,
 <tf.Variable 'lstm_1/Variable:0' shape=(1, 8) dtype=float32, numpy=
array([[0.43390197, 0.28252542, 0.27139077, 0.19655049, 0.7568088 ,
        0.05909375, 0.68569875, 0.19087408]], dtype=float32)>]
>>>
>>> pprint(h_0)
<tf.Tensor: id=422, shape=(1, 8), dtype=float32, numpy=
array([[0.59103394, 0.68249655, 0.04518601, 0.7800545 , 0.3799634 ,
        0.27347744, 0.54415804, 0.9889024 ]], dtype=float32)>
>>>
>>> pprint(c_0)
<tf.Tensor: id=421, shape=(1, 8), dtype=float32, numpy=
array([[0.43390197, 0.28252542, 0.27139077, 0.19655049, 0.7568088 ,
        0.05909375, 0.68569875, 0.19087408]], dtype=float32)>
>>>
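The stateful behaviour shown above can also be reproduced without any particular Keras version. The following NumPy sketch uses a hypothetical class TinyStatefulLSTM (not part of Keras; its weights, gate order and method names are assumptions for illustration) that keeps its states between calls and offers a reset_states method with the same semantics as described in this answer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyStatefulLSTM:
    """Hypothetical NumPy stand-in for a stateful LSTM layer (not Keras code)."""

    def __init__(self, features, units, batch, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.5, size=(features, 4 * units))
        self.U = rng.normal(scale=0.5, size=(units, 4 * units))
        self.b = np.zeros(4 * units)
        self.state_shape = (batch, units)
        self.reset_states()

    def reset_states(self, states=None):
        # No argument: zero both states. Otherwise: copy the given [h, c].
        if states is None:
            self.states = [np.zeros(self.state_shape), np.zeros(self.state_shape)]
        else:
            h, c = states
            self.states = [np.array(h, dtype=float), np.array(c, dtype=float)]

    def __call__(self, inputs):
        # Start from the stored states instead of zeros, then store the result.
        h, c = self.states
        for t in range(inputs.shape[1]):
            z = inputs[:, t, :] @ self.W + h @ self.U + self.b
            i, f, g, o = np.split(z, 4, axis=-1)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        self.states = [h, c]
        return h  # output at the last time step is the hidden state

layer = TinyStatefulLSTM(features=1, units=8, batch=1)
inputs = np.random.default_rng(1).random((1, 10, 1))

# Whole sequence in one call.
full = layer(inputs)
out_equals_hidden = np.allclose(full, layer.states[0])

# Same sequence in two halves: the state carries over between calls.
layer.reset_states()
layer(inputs[:, :5, :])
chunked = layer(inputs[:, 5:, :])
```

Because the state is carried over, feeding the sequence in two halves gives the same final output as feeding it in one call, which is the practical reason to use a stateful layer for long sequences split into chunks.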

Votes: 5

Stack Overflow user

Posted on 2019-08-19 00:23:06

I used this approach, and it worked well for me:

lstm_cell = LSTM(cell_num, return_state=True) 

output, h, c = lstm_cell(input, initial_state=[h_prev, c_prev])
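A slightly fuller sketch of this pattern, assuming TensorFlow 2.x with eager execution: the layer returns its final hidden and cell states, and these are passed as initial_state to the next call. The shapes, the zero starting states and the split into two halves are choices made for this illustration and are not part of the original answer.

```python
import numpy as np
import tensorflow as tf

units = 8
lstm = tf.keras.layers.LSTM(units, return_state=True)

inputs = np.random.random([1, 10, 1]).astype(np.float32)
h_prev = tf.zeros([1, units])
c_prev = tf.zeros([1, units])

# Whole sequence in one call: returns the output plus the final h and c.
full_out, full_h, full_c = lstm(inputs, initial_state=[h_prev, c_prev])

# Same sequence in two halves, handing the states from one call to the next.
_, h_mid, c_mid = lstm(inputs[:, :5, :], initial_state=[h_prev, c_prev])
chunk_out, chunk_h, chunk_c = lstm(inputs[:, 5:, :], initial_state=[h_mid, c_mid])
```

With return_sequences left at its default, the returned output is the hidden state at the last time step, so output and h hold the same values. Passing h and c forward by hand gives the same effect as a stateful layer while keeping the layer itself stateless.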

Votes: 3

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/42415909
