I need to train with batches whose elements have different sizes, so I tried to create a custom training loop. The main idea was to start from the one provided by Keras:
for epoch in range(epochs):
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            logits = model(x_batch_train, training=True)
            loss_value = loss_fn(y_batch_train, logits)
        grads = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

and to add a for loop over the batch size, so that I can feed the network one element at a time and update the weights only after every element of the batch has gone through the network. Something like:
for epoch in range(epochs):
    for training in range(trainingsize):
        for batch in range(batchsize):
            with tf.GradientTape() as tape:
                logits = model(x, training=True)  # Logits for this minibatch
                loss_value = loss_fn(y, logits)
            grads = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

where x and y are the single elements of the batch.
But I noticed that this way I only take the last element of the batch into account (because grads gets overwritten at every iteration).
How can I handle this? I don't know how to "merge" the different grads.
One more curiosity: I thought a variable created with a "with" statement was only valid inside that statement, so how is it possible to use it outside?
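For reference, a minimal snippet of the behavior in question (the file name is just an illustration): in Python, a `with` statement does not open a new scope, so names bound inside the block are still visible after it exits.

# The `with` block manages the resource, not the variable's visibility:
with open("example.txt", "w") as f:  # illustrative file name
    f.write("hello")
    text = "hello"

# Both names survive the block; the context manager only closed the file.
print(f.closed)  # True
print(text)      # "hello"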
UPDATE
I tried SoheilStar's solution, but tape.gradient returns a vector of None values, and on apply_gradients it says: "No gradients provided for any variable: ['conv1d/kernel:0', 'dense/kernel:0', 'dense/bias:0']."
I don't know how to debug this case in order to find the problem.
The main part of the code I am using is:
optimizer = keras.optimizers.Adam(learning_rate=0.001, name="Adam")
loss_fn = keras.losses.CategoricalCrossentropy(from_logits=True)

model = keras.Sequential()
model.add(Conv1D(2, ksize, activation='relu', input_shape=ishape))
model.add(GlobalMaxPooling1D(data_format="channels_last"))
model.add(Dense(2, activation='sigmoid'))

for epoch in range(epochsize):
    batchp = 1
    for k in range(trainingsize):
        loss_value = tf.constant(0.)
        mini_batch_losses = []
        for s in range(batchsize):
            X_train, y_train = loadvalue(batchp)  # load the single element
            with tf.GradientTape() as tape:
                logits = model(X_train, training=True)
                loss_value = loss_fn(y_train, logits)
            mini_batch_losses.append(loss_value)
            batchp += 1
        loss_avg = tf.reduce_mean(mini_batch_losses)
        grads = tape.gradient(loss_avg, model.trainable_weights)
        optimizer.apply_gradients(grads_and_vars=zip(grads, model.trainable_weights))

UPDATE 2: I noticed that if I change the training loop in the following way it works, but I don't understand why, or whether it is correct:
for epoch in range(epochsize):
    batchp = 1
    for k in range(trainingsize):
        loss_value = tf.constant(0.)
        mini_batch_losses = []
        with tf.GradientTape() as tape:
            for s in range(batchsize):
                X_train, y_train = loadvalue(batchp)
                logits = model(X_train, training=True)
                tape.watch(X_train)
                loss_value = loss_fn(y_train, logits)
                mini_batch_losses.append(loss_value)
                batchp += 1
            loss_avg = tf.reduce_mean(mini_batch_losses)
        grads = tape.gradient(loss_avg, model.trainable_weights)
        optimizer.apply_gradients(grads_and_vars=zip(grads, model.trainable_weights))

Posted on 2021-05-05 16:47:20
The grads variable only contains the gradients of the trainable variables. To apply them per element, you need to move the optimizer inside the last for loop. But why not write a normal training loop and simply set batch_size to one?
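A minimal sketch of that first suggestion, with the weights updated once per element (loadvalue, trainingsize, batchsize and epochsize are the asker's own names, assumed here):

for epoch in range(epochsize):
    batchp = 1
    for k in range(trainingsize):
        for s in range(batchsize):
            X_train, y_train = loadvalue(batchp)
            with tf.GradientTape() as tape:
                logits = model(X_train, training=True)
                loss_value = loss_fn(y_train, logits)
            # Compute and apply the gradients inside the innermost loop,
            # so every element produces its own weight update.
            grads = tape.gradient(loss_value, model.trainable_weights)
            optimizer.apply_gradients(zip(grads, model.trainable_weights))
            batchp += 1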
====== UPDATE
You can compute the loss of each sample in the last for loop, then call reduce_mean to average those losses, and then compute the gradients. Code updated:
for epoch in range(epochs):
    for training in range(trainingsize):
        mini_batch_losses = []
        for batch in range(batchsize):
            with tf.GradientTape() as tape:
                logits = model(x, training=True)  # Logits for this minibatch
                loss_value = loss_fn(y_true, logits)
            mini_batch_losses.append(loss_value)
        loss_avg = tf.reduce_mean(mini_batch_losses)
        grads = tape.gradient(loss_avg, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

https://stackoverflow.com/questions/67405427
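A note on why the asker's UPDATE 2 works: tape.gradient can only differentiate operations recorded while the tape was active, so the tf.reduce_mean must run inside the with block; when it runs after the block exits, the path from the weights to loss_avg is not on the tape and the gradients come back as None. The "merging" of per-element gradients asked about in the question can also be done explicitly, by accumulating one gradient per element and applying their average once per mini-batch. A sketch under the question's own assumptions (loadvalue and the loop bounds come from the asker's code):

for epoch in range(epochsize):
    batchp = 1
    for k in range(trainingsize):
        # One zero tensor per trainable weight, used as a gradient accumulator.
        accum = [tf.zeros_like(w) for w in model.trainable_weights]
        for s in range(batchsize):
            X_train, y_train = loadvalue(batchp)
            with tf.GradientTape() as tape:
                logits = model(X_train, training=True)
                loss_value = loss_fn(y_train, logits)
            grads = tape.gradient(loss_value, model.trainable_weights)
            accum = [a + g for a, g in zip(accum, grads)]
            batchp += 1
        # Apply the averaged gradients once per mini-batch.
        avg_grads = [a / batchsize for a in accum]
        optimizer.apply_gradients(zip(avg_grads, model.trainable_weights))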