GradientTape: getting a gradient of nan

Stack Overflow user
Asked 2021-12-03 11:26:24
Answers: 2, Views: 410, Followers: 0, Score: 2

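I am trying to compute gradients in TensorFlow, but None is returned. I have already converted the inputs to type tensorflow.python.framework.ops.EagerTensor, but that did not solve the problem.

Here is the code so far: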

accuracy = tf.keras.metrics.CategoricalAccuracy('accuracy')
loss = tf.keras.metrics.CategoricalCrossentropy('loss')
  
for epoch in range(epochs):
    accuracy.reset_states()
    loss.reset_states()
    
    for batch in iterate_minibatches(X_train, y_train, batch_size):
        imgs = batch[0]
        labels = batch[1]
        with tf.GradientTape() as tape:
            preds = model(imgs)
            
            labels = tf.convert_to_tensor(labels, dtype=tf.float32)
            
            #print(loss(labels,preds))
            # Loss is crossentropy loss with regularization term for each parameter
            total_loss = loss(labels, preds) #+l2_penalty(model, theta_A) 

        grads = tape.gradient(total_loss, model.trainable_variables)
        model.optimizer.apply_gradients(zip(grads, model.trainable_variables))
       
        accuracy.update_state(labels, preds)
        loss.update_state(labels, preds)
        print("\rEpoch: {}, Batch: {}, Loss: {:.3f}, Accuracy: {:.3f}".format(
            epoch+1, batch+1, loss.result().numpy(), accuracy.result().numpy()), flush=True, end='')
        print("")
   
print("Task B accuracy after training trained model on Task B: {}".format(model.evaluate(task_B_test)))
print("Task A accuracy after training trained model on Task B: {}".format(model.evaluate(task_A_test)))

Does anyone know why it is None, or how I can fix this?

Edit: my error message looks like this:

C:\Users\DC5DE~1.ALB\AppData\Local\Temp/ipykernel_13300/818221091.py in
     34         grads = tape.gradient(total_loss,model.trainable_variables)
     35
---> 36         model.optimizer.apply_gradients(zip(grads,model.trainable_variables))
     37
     38         accuracy.update_state(labels,preds)

AttributeError: 'NoneType' object has no attribute 'apply_gradients'

Since I am not sure whether this is related to how I pass the image data to GradientTape, here is my mini-batching function:

def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    assert inputs.shape[0] == targets.shape[0]
    if shuffle:
        indices = np.arange(inputs.shape[0])
        np.random.shuffle(indices)
    for start_idx in range(0, inputs.shape[0] - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]

Also: a similar problem is mentioned here, but without any workable solution.


2 Answers

Stack Overflow user

Accepted answer

Posted 2021-12-03 12:52:19

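You are mixing up a few things. You need to either call model.compile or define your own optimizer and use that. Also, you should not confuse your metric with your loss function. Here is a working example: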

import tensorflow as tf

accuracy = tf.keras.metrics.CategoricalAccuracy('accuracy')
metric = tf.keras.metrics.CategoricalCrossentropy('metric_categorical_crossentropy')
loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
epochs = 2
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=3, input_shape=(1,))
]) 
optimizer = tf.keras.optimizers.Adam()

dataset = tf.data.Dataset.from_tensor_slices((tf.random.normal((50, 1)), tf.random.normal((50, 3)))).batch(5)
for epoch in range(epochs):
    # reset_state() replaces the older reset_states() spelling in recent TensorFlow releases
    accuracy.reset_state()
    metric.reset_state()
    
    for i, batch in enumerate(dataset):
        imgs = batch[0]
        labels = batch[1]
        print(imgs.shape, labels.shape)
        with tf.GradientTape() as tape:
            preds = model(imgs)
                      
            #print(loss(labels,preds))
            # Loss is crossentropy loss with regularization term for each parameter
            total_loss = loss(labels, preds) #+l2_penalty(model, theta_A) 

        grads = tape.gradient(total_loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
       
        accuracy.update_state(labels, preds)
        metric.update_state(labels, preds)
        print("\rEpoch: {}, Batch: {}, Loss: {:.3f}, Accuracy: {:.3f}".format(
            epoch+1, i+1, metric.result().numpy(), accuracy.result().numpy()), flush=True, end='')
        print("")

Output:

Epoch: 1, Batch: 1, Loss: 4.209, Accuracy: 0.200
Epoch: 1, Batch: 2, Loss: 1.641, Accuracy: 0.400
Epoch: 1, Batch: 3, Loss: 1.294, Accuracy: 0.333
Epoch: 1, Batch: 4, Loss: 1.025, Accuracy: 0.300
Epoch: 1, Batch: 5, Loss: -0.110, Accuracy: 0.320
Epoch: 1, Batch: 6, Loss: 0.316, Accuracy: 0.267
Epoch: 1, Batch: 7, Loss: -0.118, Accuracy: 0.257
Epoch: 1, Batch: 8, Loss: -0.284, Accuracy: 0.225
Epoch: 1, Batch: 9, Loss: -0.249, Accuracy: 0.244
Epoch: 1, Batch: 10, Loss: -0.464, Accuracy: 0.260
Epoch: 2, Batch: 1, Loss: 4.468, Accuracy: 0.200
Epoch: 2, Batch: 2, Loss: 1.578, Accuracy: 0.400
Epoch: 2, Batch: 3, Loss: 1.012, Accuracy: 0.400
Epoch: 2, Batch: 4, Loss: 0.836, Accuracy: 0.350
Epoch: 2, Batch: 5, Loss: -0.294, Accuracy: 0.360
Epoch: 2, Batch: 6, Loss: 0.168, Accuracy: 0.300
Epoch: 2, Batch: 7, Loss: -0.201, Accuracy: 0.286
Epoch: 2, Batch: 8, Loss: -0.634, Accuracy: 0.250
Epoch: 2, Batch: 9, Loss: -0.552, Accuracy: 0.267
Epoch: 2, Batch: 10, Loss: -0.920, Accuracy: 0.280
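
The other option mentioned above, calling model.compile so that model.optimizer is populated, could look roughly like the sketch below. The toy model, the random inputs, and names such as `loss_fn`, `optimizer_before`, and `kernel_before` are assumptions made for illustration; they are not taken from the question.

```python
import tensorflow as tf

# Hypothetical toy model and data, used only for illustration.
model = tf.keras.Sequential([tf.keras.layers.Dense(units=3)])
imgs = tf.random.normal((5, 4))
labels = tf.one_hot([0, 1, 2, 0, 1], depth=3)
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

# Before compile() the model carries no usable optimizer
# (it shows up as None in the question's traceback).
optimizer_before = getattr(model, "optimizer", None)

# compile() attaches the optimizer, so model.optimizer can be used afterwards.
model.compile(optimizer=tf.keras.optimizers.Adam())
optimizer_after = model.optimizer

with tf.GradientTape() as tape:
    preds = model(imgs)
    total_loss = loss_fn(labels, preds)

# Snapshot the first trainable variable to check that the step changes it.
kernel_before = model.trainable_variables[0].numpy().copy()
grads = tape.gradient(total_loss, model.trainable_variables)
model.optimizer.apply_gradients(zip(grads, model.trainable_variables))
kernel_after = model.trainable_variables[0].numpy()

print("optimizer before compile:", optimizer_before)
print("kernel changed:", bool((kernel_before != kernel_after).any()))
```

Whether to compile or to keep a standalone optimizer is mostly a matter of taste in a custom loop; compiling is only needed here if the rest of the code relies on model.optimizer or on model.evaluate.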
Score: 1

Stack Overflow user

Posted 2021-12-03 12:14:23

You need to use tf.keras.losses.CategoricalCrossentropy to compute the loss, not tf.keras.metrics.CategoricalCrossentropy. The metric works differently and stops gradient propagation.
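
The difference can be checked directly with a small experiment. The following is a minimal sketch, assuming a hypothetical toy model and random inputs (none of these names come from the question): the same predictions are passed through the loss class and the metric class under one persistent tape, and the gradients are compared.

```python
import tensorflow as tf

# Hypothetical toy model and data, used only for illustration.
model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation="softmax")])
x = tf.random.normal((4, 2))
y = tf.one_hot([0, 1, 2, 0], depth=3)

loss_fn = tf.keras.losses.CategoricalCrossentropy()
metric = tf.keras.metrics.CategoricalCrossentropy()

# persistent=True allows tape.gradient to be called twice.
with tf.GradientTape(persistent=True) as tape:
    preds = model(x)
    loss_value = loss_fn(y, preds)    # computed directly from preds
    metric_value = metric(y, preds)   # read back from the metric's accumulated state

loss_grads = tape.gradient(loss_value, model.trainable_variables)
metric_grads = tape.gradient(metric_value, model.trainable_variables)
del tape

print("loss gradients present:", all(g is not None for g in loss_grads))
print("metric gradients are None:", all(g is None for g in metric_grads))
```

The metric's result is derived from its internal running totals rather than from `preds` itself, so the tape finds no path back to the model's variables.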

Score: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/70213587
