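I did some basic training for text prediction by following the tutorial on the official TensorFlow site. I trained my model for 40 epochs on a GTX 1050 Ti and saved the checkpoint files in a separate folder. However, when I now try to restore the model, I get a very long error: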
StreamExecutor device (0): GeForce GTX 1050 Ti, Compute Capability 6.1
WARNING:tensorflow:Entity <function standard_gru at 0x7f9e121324d0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function standard_gru at 0x7f9e121324d0>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:Entity <function cudnn_gru at 0x7f9e120c1d40> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function cudnn_gru at 0x7f9e120c1d40>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:Entity <function standard_gru at 0x7f9e121324d0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function standard_gru at 0x7f9e121324d0>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:Entity <function cudnn_gru at 0x7f9e120c1d40> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <function cudnn_gru at 0x7f9e120c1d40>: AttributeError: module 'gast' has no attribute 'Num'
WARNING:tensorflow:From /home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py:1200: NameBasedSaverStatus.__init__ (from tensorflow.python.training.tracking.util) is deprecated and will be removed in a future version.
Instructions for updating:
Restoring a name-based tf.train.Saver checkpoint using the object-based restore API. This mode uses global names to match variables, and so is somewhat fragile. It also adds new restore ops to the graph each time it is called when graph building. Prefer re-encoding training checkpoints in the object-based format: run save() on the object-based saver (the same one this message is coming from) and use that checkpoint in the future.
Traceback (most recent call last):
File "main.py", line 95, in <module>
model.load_weights(checkpoint_dir)
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 162, in load_weights
return super(Model, self).load_weights(filepath, by_name)
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1398, in load_weights
status.assert_nontrivial_match()
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py", line 917, in assert_nontrivial_match
return self.assert_consumed()
File "/home/awesome_ruler/.local/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py", line 894, in assert_consumed
(unused_attributes,))
AssertionError: Some objects had attributes which were not restored: {<tf.Variable 'embedding_1/embeddings:0' shape=(65, 256) dtype=float32, numpy=
array([[-0.00044268, -0.02351714, -0.01139065, ..., -0.00327835,
0.00074228, -0.00383734],
[-0.02313181, 0.04697707, -0.02350216, ..., 0.040385 ,
0.03087702, 0.02765551],
[ 0.0410727 , 0.00130001, 0.0051438 , ..., 0.02899202,
0.04258115, -0.03773504],
...,
[-0.03134514, 0.01370119, 0.00993627, ..., -0.02257681,
0.02617678, 0.03761976],
[-0.02954974, 0.02407967, 0.02768463, ..., -0.0056519 ,
-0.01507735, 0.04617763],
[-0.04113789, -0.03544737, 0.01056757, ..., 0.01236727,
-0.01791535, -0.01635399]], dtype=float32)>: ['embedding_1/embeddings'], <tf.Variable 'dense_1/kernel:0' shape=(1024, 65) dtype=float32, numpy=
array([[-6.7811467e-02, -2.5536597e-02, 5.1763237e-02, ...,
-6.9665730e-02, 3.9457709e-02, -5.3290475e-02],
[ 1.5835620e-02, -3.0763537e-02, -7.4058644e-02, ...,
3.8087368e-05, -9.1508478e-03, 5.5485427e-02],
[ 3.8143486e-02, 8.8131428e-04, -2.3478847e-02, ...,
-1.5135627e-02, -5.2146181e-02, 7.1185097e-02],
...,
[-6.6591002e-02, 4.7627889e-02, 5.7474524e-02, ...,
4.1528463e-02, 4.6467118e-02, -3.0670539e-02],
[-5.0804108e-02, 5.4505378e-02, -1.5776977e-03, ...,
2.1875933e-02, -2.9637258e-02, 2.0201296e-02],
[-4.7325939e-02, -8.0013275e-03, -3.6348965e-02, ...,
-7.0560835e-02, -4.9752403e-02, 1.0509960e-02]], dtype=float32)>: ['dense_1/kernel'], <tf.Variable 'dense_1/bias:0' shape=(65,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
dtype=float32)>: ['dense_1/bias'], <tf.Variable 'gru_1/kernel:0' shape=(256, 3072) dtype=float32, numpy=
array([[ 0.00432818, 0.03131782, 0.00038544, ..., -0.00559966,
0.03458985, -0.03219106],
[-0.00865119, 0.01648769, -0.00768028, ..., 0.01366192,
-0.03043955, -0.01382086],
[-0.01379537, 0.00547716, -0.00385967, ..., -0.00027269,
-0.01285852, 0.0377048 ],
...,
[-0.01940641, 0.01454895, 0.03349226, ..., -0.04234404,
-0.02699661, 0.0376601 ],
[ 0.00186675, -0.00547577, -0.02205843, ..., -0.01287581,
-0.02314153, 0.04158166],
[ 0.00954719, -0.02883693, -0.03259185, ..., -0.02587803,
0.02906795, -0.00559821]], dtype=float32)>: ['gru_1/kernel'], <tf.Variable 'gru_1/recurrent_kernel:0' shape=(1024, 3072) dtype=float32, numpy=
array([[ 9.11542401e-03, 1.50135346e-02, 2.96630897e-02, ...,
2.25223936e-02, 2.31253020e-02, -2.96920985e-02],
[-2.21075956e-02, -8.46013427e-06, -2.16848943e-02, ...,
-1.26914177e-02, -3.49153839e-02, -3.01396102e-02],
[-3.59148793e-02, 9.98445973e-03, 2.60963626e-02, ...,
3.15430500e-02, 1.28889643e-02, 3.37569825e-02],
...,
[ 3.39106433e-02, 6.54980540e-03, -1.27352085e-02, ...,
-4.14674729e-03, 3.53236459e-02, -1.36333425e-02],
[-3.50691415e-02, -1.76392253e-02, 1.67468414e-02, ...,
-2.06982102e-02, -1.06042419e-02, 2.26641595e-02],
[-1.14825107e-02, -3.46554294e-02, -1.83847174e-03, ...,
2.25809850e-02, 2.45791934e-02, -2.70933360e-02]], dtype=float32)>: ['gru_1/recurrent_kernel'], <tf.Variable 'gru_1/bias:0' shape=(2, 3072) dtype=float32, numpy=
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)>: ['gru_1/bias']}
I am trying to load the file ckpt_40.index which, as you can see, is the latest checkpoint, but I can't. This is the code I use to load my model:
checkpoint_dir = 'CheckPoints/ckpt_40.index'
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(checkpoint_dir)
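model.summary()
I am using the generate_text function from the website to make predictions.
I think a similar question was posted on Stack Overflow, but it never got an answer. I am using TensorFlow GPU 2.0-beta1, which is the latest TF version for GPU.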
Posted on 2020-03-31 11:27:27
I made a very silly mistake, one so small that I doubt anyone would have spotted it. In this line:
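checkpoint_dir = 'CheckPoints/ckpt_40.index'
Although the file on disk is named with the ".index" extension, including that extension in the path passed to load_weights makes the restore fail for some reason (possibly a bug). An error message pointing out the incorrect extension would have been more helpful.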
So, for anyone else with this problem, just change the checkpoint path to this:
checkpoint_dir = 'CheckPoints/ckpt_40'  # .index has been removed
https://stackoverflow.com/questions/60948259
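As background to the fix above: a TensorFlow-format checkpoint is stored as several files that share one prefix (for example ckpt_40.index and ckpt_40.data-00000-of-00001), and load_weights is given that shared prefix, not any single file. The sketch below is only an illustration of stripping the suffix defensively. The helper name checkpoint_prefix is hypothetical and is not part of TensorFlow or of the tutorial.

```python
import re

# Suffixes TensorFlow appends to a checkpoint prefix on disk:
#   <prefix>.index  and  <prefix>.data-NNNNN-of-NNNNN
_CKPT_SUFFIX = re.compile(r"\.(index|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path: str) -> str:
    """Hypothetical helper: return `path` without a trailing checkpoint
    file suffix, so the result can be used as a checkpoint prefix."""
    return _CKPT_SUFFIX.sub("", path)


print(checkpoint_prefix("CheckPoints/ckpt_40.index"))  # CheckPoints/ckpt_40

# Assumed usage with the model from the question (not executed here):
#   model.load_weights(checkpoint_prefix("CheckPoints/ckpt_40.index"))
# Another option is tf.train.latest_checkpoint("CheckPoints"), which returns
# the prefix of the most recent checkpoint recorded in that directory.
```

A path that already is a bare prefix passes through unchanged, so the helper can be applied to a path whether or not it carries a suffix.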