I trained an ELECTRA model from scratch using Google's implementation code:

python run_pretraining.py --data-dir gc://bucket-electra/dataset/ --model-name greek_electra --hparams hparams.json

with these JSON hyperparameters:
{
"embedding_size": 768,
"max_seq_length": 512,
"train_batch_size": 128,
"vocab_size": 100000,
"model_size": "base",
"num_train_steps": 1500000
}

After training the model, I used the convert_electra_original_tf_checkpoint_to_pytorch.py script from the transformers library to convert the checkpoint:

python convert_electra_original_tf_checkpoint_to_pytorch.py --tf_checkpoint_path output/models/transformer/greek_electra --config_file resources/hparams.json --pytorch_dump_path output/models/transformer/discriminator --discriminator_or_generator "discriminator"

Now I'm trying to load the model:
from transformers import ElectraForPreTraining
model = ElectraForPreTraining.from_pretrained('discriminator')

but I get the following error:
Traceback (most recent call last):
File "~/.local/lib/python3.9/site-packages/transformers/configuration_utils.py", line 427, in get_config_dict
config_dict = cls._dict_from_json_file(resolved_config_file)
File "~/.local/lib/python3.9/site-packages/transformers/configuration_utils.py", line 510, in _dict_from_json_file
text = reader.read()
File "/usr/lib/python3.9/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

Do you know what is causing this problem, and how to fix it?
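For context, the error itself only means that the file being parsed as the model config is not UTF-8 text: from_pretrained expects a config.json in the model directory, and if it instead opens a binary file (such as a TF checkpoint shard), decoding fails exactly like this. A minimal stdlib reproduction with a hypothetical file (not your actual checkpoint):

```python
import json
import os
import tempfile

# Create a "config.json" containing a non-UTF-8 byte (0x80), mimicking a
# binary file sitting where a JSON config was expected.
path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "wb") as f:
    f.write(b'{"model_type"\x80: "electra"}')

try:
    with open(path, encoding="utf-8") as f:
        json.load(f)
except UnicodeDecodeError as e:
    print(e.reason)  # invalid start byte
```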
Posted on 2021-05-28 23:14:27
It looks like @npit is right. The output of convert_electra_original_tf_checkpoint_to_pytorch.py does not include the configuration I provided (hparams.json), so I created an ElectraConfig object with the same parameters and passed it to the from_pretrained function. That solved the problem.
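A sketch of that fix. The embedding_size, vocab_size, and max_seq_length values come from the hparams.json above; the hidden_size, layer, head, and intermediate_size values are assumptions matching Google's ELECTRA-Base defaults, since the question only says model_size "base":

```python
from transformers import ElectraConfig

# Rebuild the configuration with the same hyperparameters used for
# pretraining, since the converted checkpoint directory lacks config.json.
config = ElectraConfig(
    vocab_size=100000,            # from hparams.json
    embedding_size=768,           # from hparams.json
    max_position_embeddings=512,  # from hparams.json (max_seq_length)
    hidden_size=768,              # assumed ELECTRA-Base values below
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
)
```

The config can then be passed explicitly when loading, e.g. ElectraForPreTraining.from_pretrained("output/models/transformer/discriminator", config=config), so from_pretrained no longer tries to read a config.json from the checkpoint directory.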
https://stackoverflow.com/questions/67740498