For loop with GRUCell in the call method of a subclassed tf.keras.Model

Stack Overflow user

Asked 2021-08-05 21:53:00

1 answer, 211 views, 0 followers, 0 votes

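I have subclassed tf.keras.Model and use tf.keras.layers.GRUCell in a for loop to compute the sequence 'y_t' (n, timesteps, hidden_units) and the final hidden state 'h_t' (n, hidden_units). To have my loop output 'y_t', I update a tf.Variable after each iteration of the loop. Calling the model with model(input) is not a problem, but when I fit the model with the for loop in the call method, I get either a TypeError or a ValueError.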

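Please note that I cannot simply use tf.keras.layers.GRU because I am trying to implement this paper. Instead of just passing x_t to the next cell in the RNN, the paper performs some computation as a step in the for loop (they implement it in PyTorch) and passes the result of that computation to the RNN cell. They essentially end up doing this: h_t = f(special_x_t, h_{t-1}).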

See the model below, which causes the error:

class CustomGruRNN(tf.keras.Model):
    def __init__(self, batch_size, timesteps, hidden_units, features, **kwargs):

        # Inheritance
        super().__init__(**kwargs)

        # Args
        self.batch_size = batch_size
        self.timesteps = timesteps
        self.hidden_units = hidden_units        

        # Stores y_t
        self.rnn_outputs = tf.Variable(tf.zeros(shape=(batch_size, timesteps, hidden_units)), trainable=False)

        # To be used in for loop in call
        self.gru_cell = tf.keras.layers.GRUCell(units=hidden_units)

        # Reshape to match input dimensions
        self.dense = tf.keras.layers.Dense(units=features)

    def call(self, inputs):
        """Inputs is rank-3 tensor of shape (n, timesteps, features) """

        # Initial state for gru cell
        h_t = tf.zeros(shape=(self.batch_size, self.hidden_units))

        for timestep in tf.range(self.timesteps):
            # Get the current timestep of the inputs
            x_t = tf.gather(inputs, timestep, axis=1)  # Same as x_t = inputs[:, timestep, :]

            # Compute outputs and hidden states
            y_t, h_t = self.gru_cell(x_t, h_t)
            
            # Update y_t at the t^th timestep
            self.rnn_outputs = self.rnn_outputs[:, timestep, :].assign(y_t)

        # Outputs need to have same last dimension as inputs
        outputs = self.dense(self.rnn_outputs)

        return outputs

An example that raises the error:

# Arbitrary values for dataset
num_samples = 128
batch_size = 4
timesteps = 5
features = 10

# Arbitrary dataset
x = tf.random.uniform(shape=(num_samples, timesteps, features))
y = tf.random.uniform(shape=(num_samples, timesteps, features))

train_data = tf.data.Dataset.from_tensor_slices((x, y))
train_data = train_data.shuffle(batch_size).batch(batch_size, drop_remainder=True)

# Model with arbitrary hidden units
model = CustomGruRNN(batch_size, timesteps, hidden_units=5, features=features)
model.compile(loss=tf.keras.losses.MeanSquaredError(), optimizer=tf.keras.optimizers.Adam())

When running eagerly:

model.fit(train_data, epochs=2, run_eagerly=True)

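Epoch 1/2 WARNING:tensorflow:Gradients do not exist for variables ['stack_overflow_gru_rnn/gru_cell/kernel:0', 'stack_overflow_gru_rnn/gru_cell/recurrent_kernel:0', 'stack_overflow_gru_rnn/gru_cell/bias:0'] when minimizing the loss. ValueError: substring not found ValueError

When not running eagerly: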

model.fit(train_data, epochs=2, run_eagerly=False)

Epoch 1/2 TypeError: in user code: TypeError: Can not convert a NoneType into a Tensor or Operation.

keras

recurrent-neural-network

python

tensorflow

machine-learning

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-08-06 04:24:49

EDIT

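Although the answer based on the TensorFlow guide is sufficient, I think my self-answered question about custom cells for RNNs is a better option. Please see that answer. Using a custom RNN cell removes the need for tf.transpose and tf.TensorArray, which reduces the complexity of the code and improves readability.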
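The answer referred to above is not reproduced on this page, so what follows is only a minimal sketch of what a custom RNN cell could look like, not the author's code. The class name `CustomGRUCell`, the `Dense` pre-processing step standing in for the paper's per-step computation, and all sizes are assumptions made for illustration. The cell is wrapped in `tf.keras.layers.RNN`, which handles the iteration over timesteps.

```python
import tensorflow as tf


class CustomGRUCell(tf.keras.layers.Layer):
    """Hypothetical cell: transforms x_t, then delegates to a GRUCell."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # tf.keras.layers.RNN reads these attributes from the cell
        self.state_size = units
        self.output_size = units
        # Placeholder for the paper's per-step computation (an assumption)
        self.pre = tf.keras.layers.Dense(units, activation="tanh")
        self.gru_cell = tf.keras.layers.GRUCell(units)

    def build(self, input_shape):
        # input_shape is the per-timestep shape: (batch, features)
        self.pre.build(input_shape)
        self.gru_cell.build((input_shape[0], self.units))

    def call(self, inputs, states):
        special_x_t = self.pre(inputs)  # (batch, units)
        # h_t = f(special_x_t, h_{t-1})
        output, new_states = self.gru_cell(special_x_t, states)
        return output, new_states


# Arbitrary sizes: batch=4, timesteps=5, features=10, hidden units=5
rnn = tf.keras.layers.RNN(
    CustomGRUCell(units=5), return_sequences=True, return_state=True
)
x = tf.random.uniform(shape=(4, 5, 10))
y, h = rnn(x)  # y: full output sequence, h: final hidden state
```

Here `y` plays the role of 'y_t' from the question and `h` the final 'h_t'. No explicit loop, transpose, or `tf.TensorArray` appears in user code.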

Original self-answer

Using DynamicRNN, as described at the bottom of TensorFlow's Effective TensorFlow 2 guide, solved my problem.

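To expand briefly on the conceptual usage of DynamicRNN: an RNN cell is defined (a GRU cell in my case), and any number of custom steps can then be defined inside the tf.range loop. Values should be tracked with tf.TensorArray objects created outside the loop but inside the call method itself, and the size of such arrays can be determined by reading the .shape attribute of the (input) tensor. Notably, the DynamicRNN object works with model fit, where the default execution mode is 'graph' mode rather than the slower 'eager execution' mode.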
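The pattern described above can be sketched roughly as follows. This is an adaptation of the DynamicRNN idea from the guide to the model in the question, not a verbatim copy of either; the class name `DynamicGruRNN` and the sizes are assumptions. The key difference from the failing model is that per-step outputs are written to a `tf.TensorArray` created inside `call`, instead of being assigned into a `tf.Variable`.

```python
import tensorflow as tf


class DynamicGruRNN(tf.keras.Model):
    """Hypothetical model name; steps a GRUCell through time manually."""

    def __init__(self, hidden_units, features, **kwargs):
        super().__init__(**kwargs)
        self.hidden_units = hidden_units
        self.gru_cell = tf.keras.layers.GRUCell(units=hidden_units)
        self.dense = tf.keras.layers.Dense(units=features)

    def call(self, inputs):
        # (batch, timesteps, features) -> (timesteps, batch, features)
        inputs = tf.transpose(inputs, [1, 0, 2])
        timesteps = inputs.shape[0]  # static size read from the input
        batch = tf.shape(inputs)[1]

        # Initial hidden state and storage for the per-step outputs
        h_t = tf.zeros(shape=(batch, self.hidden_units))
        rnn_outputs = tf.TensorArray(tf.float32, size=timesteps)

        for t in tf.range(timesteps):
            # Any custom per-step computation on inputs[t] could go here
            y_t, h_t = self.gru_cell(inputs[t], h_t)
            rnn_outputs = rnn_outputs.write(t, y_t)

        # (timesteps, batch, hidden) -> (batch, timesteps, hidden)
        y = tf.transpose(rnn_outputs.stack(), [1, 0, 2])
        return self.dense(y)


# Arbitrary sizes: batch=4, timesteps=5, features=10
x = tf.random.uniform(shape=(4, 5, 10))
model = DynamicGruRNN(hidden_units=5, features=10)
outputs = model(x)
```

Because `write` returns a new `tf.TensorArray` that is reassigned on each iteration, the outputs stay connected to the GRU cell's weights, which is what the `tf.Variable` assignment in the question did not achieve.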

Finally, one may need 'DynamicRNN' because, by default, the `tf.keras.layers.GRU` computation is loosely described by the following recurrent logic (assume 'f' is a `tf.keras.layers.GRUCell`):

import numpy as np
import tensorflow as tf

# Arbitrary sizes for the demonstration
batch, total_timesteps, features, hidden_cell_units = 4, 5, 10, 5

# The GRU cell function 'f'
f = tf.keras.layers.GRUCell(units=hidden_cell_units)

# Numpy is used here for ease of indexing, but in general you should use
# tensors and transpose them accordingly (see the previously linked guide)
inputs = np.random.randn(batch, total_timesteps, features).astype(np.float32)

# List for tracking outputs -- just for simple demonstration... again please see the guide for more details
outputs = []

# Initialize the 'hidden state' (often referred to as h_naught and denoted h_0) of the RNN cell
state_at_t_minus_1 = tf.zeros(shape=(batch, hidden_cell_units))

# Iterate through the input until all timesteps in the sequence have been 'seen' by the GRU cell function 'f'
for timestep_t in range(total_timesteps):
    # This is of shape (batch, features)
    input_at_t = tf.convert_to_tensor(inputs[:, timestep_t, :])

    # output_at_t of shape (batch, hidden_units_of_cell) and state_at_t (batch, hidden_units_of_cell)
    output_at_t, state_at_t = f(input_at_t, state_at_t_minus_1)
    outputs.append(output_at_t)

    # When the loop restarts, this variable will be used in the next GRU Cell function call 'f'
    state_at_t_minus_1 = state_at_t

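You may want to add other steps (for example, dense layers or other layers) inside the for loop of the recurrent logic to modify the inputs and states passed to the GRU cell function 'f'. This is one motivation for DynamicRNN.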
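As a hedged illustration of such an extra step, the sketch below feeds the cell an input that depends on the previous hidden state, something a plain `tf.keras.layers.GRU` layer does not expose. The `input_transform` layer and the choice to concatenate x_t with the previous state are assumptions made for this example and are not taken from the paper.

```python
import tensorflow as tf

# Arbitrary sizes for the demonstration
batch, total_timesteps, features, hidden_cell_units = 4, 5, 10, 5

f = tf.keras.layers.GRUCell(units=hidden_cell_units)
# Hypothetical extra step: mixes x_t with the previous hidden state
input_transform = tf.keras.layers.Dense(units=features, activation="relu")

inputs = tf.random.uniform(shape=(batch, total_timesteps, features))
state = tf.zeros(shape=(batch, hidden_cell_units))
outputs = []

for t in range(total_timesteps):
    x_t = inputs[:, t, :]  # (batch, features)
    # special_x_t depends on both x_t and h_{t-1}
    special_x_t = input_transform(tf.concat([x_t, state], axis=-1))
    # h_t = f(special_x_t, h_{t-1})
    output_at_t, state = f(special_x_t, state)
    outputs.append(output_at_t)

# Stack the per-step outputs into (batch, timesteps, hidden_cell_units)
sequence = tf.stack(outputs, axis=1)
```

This plain Python loop is meant for eager experimentation only; inside a model that is trained with fit, the same step would go inside the tf.range loop with a `tf.TensorArray`.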

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/68673890
