I want to train a spelling-correction model. I trained two models: allegro/plt5-base on Polish sentences and google/t5-v1_1-base on English ones. Unfortunately, I don't know why, but both models shorten sentences. Example:
phrases = ['The name of the man who was kild was Jack Robbinson he has black hair brown eyes blue Jacket and blue Jeans.']
encoded = tokenizer(phrases, return_tensors="pt", padding=True, max_length=512, truncation=True)
print(encoded)
# {'input_ids': tensor([[ 37, 564, 13, 8, 388, 113, 47, 3, 157, 173,
# 26, 47, 4496, 5376, 4517, 739, 3, 88, 65, 1001,
# 1268, 4216, 2053, 1692, 24412, 11, 1692, 3966, 7, 5,
# 1]], device='cuda:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1]], device='cuda:0')}
encoded.to('cuda')
translated = model.generate(**encoded)
print(translated)
# tensor([[ 0, 37, 564, 13, 8, 388, 113, 47, 2170, 47, 4496, 5376,
# 4517, 739, 3, 88, 65, 1001, 1268, 4216]], device='cuda:0')
tokenizer.batch_decode(translated, skip_special_tokens=True)
#['The name of the man who was born was Jack Robbinson he has black hair brown']
Something like this happens with almost every long sentence. Following the documentation (doc/t5.html), I tried to check whether the model has a maximum sentence length set. But the model's config has no such field: "n_positions – The maximum sequence length that this model might ever be used with. Typically set this to something large just in case (e.g., 512 or 1024 or 2048). n_positions can also be accessed via the property max_position_embeddings." This is the model's entire config:
T5Config {
"_name_or_path": "final_model_t5_800_000",
"architectures": [
"T5ForConditionalGeneration"
],
"d_ff": 2048,
"d_kv": 64,
"d_model": 768,
"decoder_start_token_id": 0,
"dropout_rate": 0.1,
"eos_token_id": 1,
"feed_forward_proj": "gated-gelu",
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"num_decoder_layers": 12,
"num_heads": 12,
"num_layers": 12,
"output_past": true,
"pad_token_id": 0,
"relative_attention_max_distance": 128,
"relative_attention_num_buckets": 32,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.18.0",
"use_cache": true,
"vocab_size": 32128
}
How can I make the model return full sentences?
Update
Earlier I was looking at the old documentation. But in the new one, I don't see any field in the config for a maximum sentence length either. New documentation
Posted on 2022-07-07 07:18:38
I've managed to solve the problem. When generating tokens with the model, you have to add the max_length parameter, like this:
translated = model.generate(**encoded, max_length=1024)
With that, the model no longer truncates sentences.
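This behavior comes from the generation config, not from the model architecture: `generate()` defaults to `max_length=20` tokens, which matches the exact length at which the outputs above were cut off. A minimal runnable sketch of the fix, using a tiny randomly initialized T5 as a stand-in for the fine-tuned checkpoint from the question (the config sizes and dummy input below are arbitrary, not the author's):

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration

# Tiny randomly initialized T5, a stand-in for the fine-tuned
# checkpoint in the question; the sizes here are arbitrary.
config = T5Config(
    vocab_size=100, d_model=32, d_ff=64, d_kv=8,
    num_layers=2, num_decoder_layers=2, num_heads=4,
    decoder_start_token_id=0,
)
model = T5ForConditionalGeneration(config).eval()

input_ids = torch.randint(2, 100, (1, 30))  # dummy 30-token "sentence"

# Without an explicit limit, generate() stops at the default
# max_length of 20 tokens, truncating long outputs.
short = model.generate(input_ids=input_ids)

# Raising the cap lets decoding run until EOS or the new limit.
full = model.generate(input_ids=input_ids, max_length=1024)
```

In recent transformers versions, `max_new_tokens` is the recommended alternative: it bounds only the generated tokens rather than the total sequence length.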
https://stackoverflow.com/questions/72882799