为了进行一些参数调整,我喜欢使用Keras循环一些训练函数。然而,我意识到当使用tensorflow.keras.metrics.AUC()作为度量时,对于每个训练循环,都会将一个整数添加到auc度量名称中(例如auc_1,auc_2,...)。因此,实际上keras指标是以某种方式存储的,即使在训练函数之外也是如此。
这首先会导致回调不再识别指标,也让我想知道是否没有存储其他东西,如模型权重。
我如何重置指标,是否需要重置keras存储的其他内容才能干净地重新启动训练?
下面您可以找到一个最小的工作示例:
编辑:此示例似乎只适用于tensorflow 2.2
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.metrics import AUC
def dummy_network(input_shape):
model = keras.Sequential()
model.add(keras.layers.Dense(10,
input_shape=input_shape,
activation=tf.nn.relu,
kernel_initializer='he_normal',
kernel_regularizer=keras.regularizers.l2(l=1e-3)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(11, activation='sigmoid'))
model.compile(optimizer='adagrad',
loss='binary_crossentropy',
metrics=[AUC()])
return model
def train():
CB_lr = tf.keras.callbacks.ReduceLROnPlateau(
monitor="val_auc",
patience=3,
verbose=1,
mode="max",
min_delta=0.0001,
min_lr=1e-6)
CB_es = tf.keras.callbacks.EarlyStopping(
monitor="val_auc",
min_delta=0.00001,
verbose=1,
patience=10,
mode="max",
restore_best_weights=True)
callbacks = [CB_lr, CB_es]
y = [np.ones((11, 1)) for _ in range(1000)]
x = [np.ones((37, 12, 1)) for _ in range(1000)]
dummy_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size=100).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size=100).repeat()
model = dummy_network(input_shape=((37, 12, 1)))
model.fit(dummy_dataset, validation_data=val_dataset, epochs=2,
steps_per_epoch=len(x) // 100,
validation_steps=len(x) // 100, callbacks=callbacks)
for i in range(3):
print(f'\n\n **** Loop {i} **** \n\n')
train()输出为:
**** Loop 0 ****
2020-06-16 14:37:46.621264: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f991e541f10 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-16 14:37:46.621296: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Epoch 1/2
10/10 [==============================] - 0s 44ms/step - loss: 0.1295 - auc: 0.0000e+00 - val_loss: 0.0310 - val_auc: 0.0000e+00 - lr: 0.0010
Epoch 2/2
10/10 [==============================] - 0s 10ms/step - loss: 0.0262 - auc: 0.0000e+00 - val_loss: 0.0223 - val_auc: 0.0000e+00 - lr: 0.0010
**** Loop 1 ****
Epoch 1/2
10/10 [==============================] - ETA: 0s - loss: 0.4751 - auc_1: 0.0000e+00WARNING:tensorflow:Reduce LR on plateau conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_1,val_loss,val_auc_1,lr
WARNING:tensorflow:Early stopping conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_1,val_loss,val_auc_1,lr
10/10 [==============================] - 0s 36ms/step - loss: 0.4751 - auc_1: 0.0000e+00 - val_loss: 0.3137 - val_auc_1: 0.0000e+00 - lr: 0.0010
Epoch 2/2
10/10 [==============================] - ETA: 0s - loss: 0.2617 - auc_1: 0.0000e+00WARNING:tensorflow:Reduce LR on plateau conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_1,val_loss,val_auc_1,lr
WARNING:tensorflow:Early stopping conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_1,val_loss,val_auc_1,lr
10/10 [==============================] - 0s 10ms/step - loss: 0.2617 - auc_1: 0.0000e+00 - val_loss: 0.2137 - val_auc_1: 0.0000e+00 - lr: 0.0010
**** Loop 2 ****
Epoch 1/2
10/10 [==============================] - ETA: 0s - loss: 0.1948 - auc_2: 0.0000e+00WARNING:tensorflow:Reduce LR on plateau conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_2,val_loss,val_auc_2,lr
WARNING:tensorflow:Early stopping conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_2,val_loss,val_auc_2,lr
10/10 [==============================] - 0s 34ms/step - loss: 0.1948 - auc_2: 0.0000e+00 - val_loss: 0.0517 - val_auc_2: 0.0000e+00 - lr: 0.0010
Epoch 2/2
10/10 [==============================] - ETA: 0s - loss: 0.0445 - auc_2: 0.0000e+00WARNING:tensorflow:Reduce LR on plateau conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_2,val_loss,val_auc_2,lr
WARNING:tensorflow:Early stopping conditioned on metric `val_auc` which is not available. Available metrics are: loss,auc_2,val_loss,val_auc_2,lr
10/10 [==============================] - 0s 10ms/step - loss: 0.0445 - auc_2: 0.0000e+00 - val_loss: 0.0389 - val_auc_2: 0.0000e+00 - lr: 0.0010发布于 2020-06-16 21:04:25
对于我来说,您的可重现示例在几个地方失败了,所以我只做了一些修改(我使用的是TF 2.1)。在运行它之后,我能够通过指定metrics=[AUC(name='auc')]来去掉额外的指标名称。下面是完整的(固定的)可重现的例子:
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.metrics import AUC
def dummy_network(input_shape):
model = keras.Sequential()
model.add(keras.layers.Dense(10,
input_shape=input_shape,
activation=tf.nn.relu,
kernel_initializer='he_normal',
kernel_regularizer=keras.regularizers.l2(l=1e-3)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(11, activation='softmax'))
model.compile(optimizer='adagrad',
loss='binary_crossentropy',
metrics=[AUC(name='auc')])
return model
def train():
CB_lr = tf.keras.callbacks.ReduceLROnPlateau(
monitor="val_auc",
patience=3,
verbose=1,
mode="max",
min_delta=0.0001,
min_lr=1e-6)
CB_es = tf.keras.callbacks.EarlyStopping(
monitor="val_auc",
min_delta=0.00001,
verbose=1,
patience=10,
mode="max",
restore_best_weights=True)
callbacks = [CB_lr, CB_es]
y = tf.keras.utils.to_categorical([np.random.randint(0, 11) for _ in range(1000)])
x = [np.ones((37, 12, 1)) for _ in range(1000)]
dummy_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size=100).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size=100).repeat()
model = dummy_network(input_shape=((37, 12, 1)))
model.fit(dummy_dataset, validation_data=val_dataset, epochs=2,
steps_per_epoch=len(x) // 100,
validation_steps=len(x) // 100, callbacks=callbacks)
for i in range(3):
print(f'\n\n **** Loop {i} **** \n\n')
train()Train for 10 steps, validate for 10 steps
Epoch 1/2
1/10 [==>...........................] - ETA: 6s - loss: 0.3426 - auc: 0.4530
7/10 [====================>.........] - ETA: 0s - loss: 0.3318 - auc: 0.4895
10/10 [==============================] - 1s 117ms/step - loss: 0.3301 -
auc: 0.4893 - val_loss: 0.3222 - val_auc: 0.5085这是因为在每个循环中,您通过执行以下操作创建了一个没有指定名称的新指标:metrics=[AUC()]。在循环的第一次迭代中,TF自动在名称空间中创建了一个名为auc的变量,但是在循环的第二次迭代中,名称'auc'已经被采用,因此TF将其命名为auc_1,因为您没有指定名称。但是,您的回调被设置为基于auc,这是此模型没有的度量(它是上一次循环中模型的度量)。因此,您可以执行name='auc'并覆盖以前的指标名称,或者在循环之外定义它,如下所示:
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.metrics import AUC
auc = AUC()
def dummy_network(input_shape):
model = keras.Sequential()
model.add(keras.layers.Dense(10,
input_shape=input_shape,
activation=tf.nn.relu,
kernel_initializer='he_normal',
kernel_regularizer=keras.regularizers.l2(l=1e-3)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(11, activation='softmax'))
model.compile(optimizer='adagrad',
loss='binary_crossentropy',
metrics=[auc])
return model而且不用担心keras会重新设置指标。它会在fit()方法中处理所有这些问题。如果你想要更多的灵活性和/或自己做,我建议使用自定义训练循环,并自己重置它:
auc = tf.keras.metrics.AUC()
auc.update_state(np.random.randint(0, 2, 10), np.random.randint(0, 2, 10))
print(auc.result())
auc.reset_states()
print(auc.result())Out[6]: <tf.Tensor: shape=(), dtype=float32, numpy=0.875> # state updatedOut[8]: <tf.Tensor: shape=(), dtype=float32, numpy=0.0> # state resethttps://stackoverflow.com/questions/62408749
复制相似问题