I want an adaptive learning rate based on time steps rather than on time (epochs), unlike most schedulers, which are time-based. I have a model:
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras import layers
from tensorflow.keras.optimizers import Adam
class DQNagent:
    def __init__(self, state_size, action_size, step_size=1):
        self.state_size = state_size
        self.action_size = action_size
        self.step_size = step_size              # sequence length of the input
        self.lr = 1e-2
        self.model = self.build_model()         # original model
        self.target_model = self.build_model()  # target model

    def build_model(self):
        x_in = layers.Input(shape=(self.step_size, self.state_size))
        x_out = layers.Dense(20, activation='relu')(x_in)
        output = layers.Dense(self.action_size, activation='linear')(x_out)
        self.learning_rate = CustomSchedule()
        opt = Adam(self.learning_rate)
        model = Model(inputs=x_in, outputs=output, name="DQN")
        model.compile(loss='mse', optimizer=opt)
        return model

I want the scheduler to be something like this:
class CustomSchedule:
    def __init__(self, lr=1e-2):
        super(CustomSchedule, self).__init__()
        self.lr = lr
        self.t = 0

    def __call__(self):
        self.t += 1
        if self.t % 100 == 0:
            self.lr /= 10
        return self.lr

My main code does not declare everything, but it has something like this:
dqn = DQNagent(state_size, action_size)
for step in range(1000):
    states_all = np.array([[[0, 0, 1], [1, 0, 1], [0, -1, 1], [1, -1, 1]]])
    Q_values = dqn.model.predict(states_all)[0]
    # training
    batch = memory.sample(batch_size)
    batch_states = utils.get_states_user(batch)  # assuming I have generated states using this method
    Q_states = dqn.model.predict(batch_states)   # assuming I have sampled batch states
    dqn.model.fit(batch_states, Q_states, verbose=0)

I want to schedule my learning rate so that if step % 100 == 0, the learning rate is reduced as learning_rate / 10. With the CustomSchedule class I created, I would have to recompile the model, which does not seem to play well with saving and loading weights. Is there another way I can do this?
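For reference, the step-based rule described above (divide the learning rate by 10 every 100 steps) can also be written as a closed-form function of the step count, which needs no mutable state and no recompilation. This is a minimal pure-Python sketch (names are illustrative, not from any library):

```python
def step_lr(step, initial_lr=1e-2, drop_every=100, factor=10):
    """Closed-form staircase schedule: divide initial_lr by `factor`
    once for every `drop_every` completed steps."""
    return initial_lr / factor ** (step // drop_every)

# steps 0-99 -> 1e-2, steps 100-199 -> 1e-3, steps 200-299 -> 1e-4, ...
```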
Edit:
I edited my code following @FedericoMalerba's answer.

I created a decay_func like:

def decay_func(step, lr):
    return lr / 10**(step / 100)

Then I added the following changes to my DQNAgent class:
class DQNAgent():
    def __init__(self, state_size, action_size):
        self.lr = 1e-2
        self.step = tf.Variable(0, trainable=False, name='Step', dtype=tf.int64)
        self.decaying_lr = partial(decay_func, step=self.step, lr=self.lr)

    def __call__(self):
        self.step.assign_add(1)
        return self.step

and I call dqn() for every step in my main code. The callable decaying_lr is passed to the optimizer in build_model() as opt = tf.keras.optimizers.Adam(self.decaying_lr).
Posted on 2021-01-01 00:32:30
A general way to solve this is to create a callable (a function taking no arguments) and pass it to the Adam optimizer you define in DQNagent.build_model(). To do so, follow these steps:
def decay_func(step_tensor, **other_arguments_your_func_needs):
    # body of the function. The input argument step_tensor must be used
    # to determine the learning rate that will be returned
    return learning_rate

# create a step counter shared with the optimizer
step = tf.Variable(0, trainable=False, name='Step', dtype=tf.int64)

# bind all arguments so the resulting callable takes none
from functools import partial
decaying_learning_rate = partial(decay_func, step_tensor=step, **other_arguments_your_func_needs)

# pass the callable (not a number) to the optimizer
opt = tf.keras.optimizers.Adam(decaying_learning_rate)

# at every training step, increment the counter
step.assign_add(1)

What you are effectively doing is creating a callable, decaying_learning_rate, that needs no arguments because they are all supplied by the functools.partial call. The TensorFlow optimizer will recognize that the learning rate is not a number but a callable, and will invoke it like this:
this_step_learning_rate = decaying_learning_rate()

Since tensors are objects shared across the runtime, when you increment the step counter with step.assign_add(1), that new step will be used to compute the new learning rate inside decay_func on the next call the optimizer makes. This happens even though the updated tensor is never explicitly passed anywhere. Magic!
Incidentally, this is exactly what ExponentialDecay does. The only thing I have presented here is a general way to define your own decay_func and make it work just like TF's pre-implemented exponential decay.
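The shared-counter mechanism described above can be sketched in plain Python, with a toy Step class standing in for tf.Variable and the zero-argument callable playing the role the optimizer sees (all names here are illustrative):

```python
from functools import partial

class Step:
    """Toy stand-in for tf.Variable: a mutable counter shared by reference."""
    def __init__(self):
        self.value = 0
    def assign_add(self, n):
        self.value += n

def decay_func(step_tensor, initial_lr):
    # staircase decay: divide by 10 for every 100 completed steps
    return initial_lr / 10 ** (step_tensor.value // 100)

step = Step()
# bind every argument so the callable takes none, as the optimizer expects
decaying_learning_rate = partial(decay_func, step_tensor=step, initial_lr=1e-2)

lr_before = decaying_learning_rate()  # 0.01
for _ in range(100):
    step.assign_add(1)                # training loop increments the counter
lr_after = decaying_learning_rate()   # 0.001: the callable sees the shared counter
```

Nothing is re-passed to the callable; it simply reads the counter it already holds a reference to, which is the same reason the tf.Variable version works.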
Posted on 2020-12-31 14:15:00
What you are looking for is exponential decay. You can use this as the learning rate:
initial_learning_rate = 0.1
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate,
    decay_steps=50,
    decay_rate=0.1,
    staircase=True)

Starting from a learning rate of 0.1, the learning rate is divided by 10 every 50 steps.
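With staircase=True, ExponentialDecay computes lr = initial_learning_rate * decay_rate ** floor(step / decay_steps). A pure-Python sketch of that formula (the function name is mine, not TensorFlow's) reproduces the schedule:

```python
def staircase_exponential_decay(step, initial_lr=0.1, decay_steps=50, decay_rate=0.1):
    # formula applied by ExponentialDecay when staircase=True
    return initial_lr * decay_rate ** (step // decay_steps)

# steps 0-49 -> 0.1, steps 50-99 -> 0.01, steps 100-149 -> 0.001, ...
```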
Here is a working training script in which I print out the decayed learning rates. I made a custom callback so that it prints the learning rate at the end of every batch.
import tensorflow as tf
from tensorflow import keras
import numpy as np
(xtrain, ytrain), _ = keras.datasets.mnist.load_data()
xtrain = np.float32(xtrain/255)
ytrain = np.int32(ytrain)
def pre_process(inputs, targets):
    inputs = tf.expand_dims(inputs, -1)
    targets = tf.one_hot(targets, depth=10)
    return inputs, targets  # inputs were already scaled to [0, 1] above
train_ds = tf.data.Dataset.from_tensor_slices((xtrain, ytrain)).\
    take(10_000).shuffle(100).batch(8).map(pre_process)
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Dropout, Flatten
model = tf.keras.Sequential([
    Conv2D(filters=16, kernel_size=(3, 3),
           strides=(1, 1), input_shape=(28, 28, 1)),
    MaxPool2D(pool_size=(2, 2)),
    Conv2D(filters=32, kernel_size=(3, 3),
           strides=(1, 1)),
    MaxPool2D(pool_size=(2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(5e-1),
    Dense(10)])
class PrintLRCallback(tf.keras.callbacks.Callback):
    def on_batch_end(self, batch, logs=None):
        print(model.optimizer.iterations.numpy(),
              model.optimizer._decayed_lr(tf.float32).numpy())
initial_learning_rate = 0.1
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate,
    decay_steps=50,
    decay_rate=0.1,
    staircase=True)
model.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),  # model outputs logits
              optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule))
history = model.fit(train_ds, verbose=0,
                    callbacks=[PrintLRCallback()],
                    steps_per_epoch=250)

Every 10 iterations, here are the iteration count and the learning rate:
10 0.1
20 0.1
30 0.1
40 0.1
50 0.01
60 0.01
70 0.01
80 0.01
90 0.01
100 0.001
110 0.001
120 0.001
130 0.001
140 0.001
150 0.0001
160 0.0001
170 0.0001
180 0.0001
190 0.0001

https://stackoverflow.com/questions/65521485