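In TensorFlow, using slim.learning.train (TF 0.11), I would like to restore a model from a checkpoint and continue training. The model has already completed one successful training run, and I want to fine-tune it. However, when I try this, TF crashes with the error "Init operations did not make model ready."
I run training as follows: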

    tf.contrib.slim.learning.train(
        train_op,
        train_dir,
        log_every_n_steps=FLAGS.log_every_n_steps,
        graph=g,
        global_step=model.global_step,
        number_of_steps=FLAGS.number_of_steps,
        init_fn=model.init_fn,
        saver=model.saver,
        session_config=session_config)

I tried three options:
#1
Following this doc:

    model.init_fn = None

#2

    with g.as_default():
        model_path = tf.train.latest_checkpoint(train_dir)
        if model_path:
            def restore_fn(sess):
                tf.logging.info(
                    "Restoring SA&T variables from checkpoint file %s",
                    restore_fn.model_path)
                model.saver.restore(sess, restore_fn.model_path)
            restore_fn.model_path = model_path
            model.init_fn = restore_fn
        else:
            model.init_fn = None

#3

    with g.as_default():
        model_path = tf.train.latest_checkpoint(train_dir)
        if model_path:
            variables_to_restore = tf.contrib.slim.get_variables_to_restore()
            model.init_fn = tf.contrib.framework.assign_from_checkpoint_fn(
                model_path, variables_to_restore)
        else:
            model.init_fn = None

Posted on 2016-10-19 22:43:54
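Problem solved. The cause was that the saver (tf.train.Saver) was defined right after the model was built.
Defining it after the train ops are defined instead fixed the problem.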
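The answer does not say why the order matters. One plausible reading, stated here as an assumption rather than something the post confirms, is that a saver created without an explicit variable list only covers the variables that exist when it is constructed. Variables added later together with the train op (for example optimizer state) would then never be restored, and would still be uninitialized when the readiness check runs. The sketch below is a toy model in plain Python that only illustrates this ordering effect. It does not call TensorFlow, and ToyGraph, ToySaver, the helper functions, and the variable names are all hypothetical.

```python
# Toy model only: none of these classes or names are TensorFlow APIs.


class ToyGraph:
    """Holds an ordered list of variable names."""

    def __init__(self):
        self.variables = []

    def add_variable(self, name):
        self.variables.append(name)


class ToySaver:
    """Snapshots the graph's variable list at construction time (assumed behaviour)."""

    def __init__(self, graph):
        self.var_list = list(graph.variables)

    def save(self, values):
        return {name: values[name] for name in self.var_list}

    def restore(self, checkpoint, session_values):
        for name in self.var_list:
            session_values[name] = checkpoint[name]


def build_model(graph):
    graph.add_variable("weights")
    graph.add_variable("biases")


def build_train_op(graph):
    # Stand-in for optimizer state that is created along with the train op.
    graph.add_variable("weights/Momentum")
    graph.add_variable("biases/Momentum")


def uninitialized_after_restore(saver_before_train_op):
    graph = ToyGraph()
    build_model(graph)
    if saver_before_train_op:
        saver = ToySaver(graph)      # saver sees only the model variables
        build_train_op(graph)
    else:
        build_train_op(graph)
        saver = ToySaver(graph)      # saver sees every variable

    # First run: all variables have values, then a checkpoint is written.
    trained = {name: 1.0 for name in graph.variables}
    checkpoint = saver.save(trained)

    # Second run: nothing is initialized, values come from the checkpoint only.
    session_values = {}
    saver.restore(checkpoint, session_values)
    return [name for name in graph.variables if name not in session_values]


early = uninitialized_after_restore(saver_before_train_op=True)
late = uninitialized_after_restore(saver_before_train_op=False)
print("saver before train op, still uninitialized:", early)
print("saver after train op, still uninitialized: ", late)
```

In this toy, creating the saver before the train op leaves the two optimizer variables without values after a restore, while creating it afterwards leaves none, which would match the fix described above if the assumption holds.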
https://stackoverflow.com/questions/40128292