
HuggingFace / PipelineException: No mask_token (<mask>) found on the input

Stack Overflow user
Asked on 2022-01-05 14:37:46
Answers: 1 · Views: 175 · Followers: 0 · Votes: 0

Goal: loop over many models with a for-loop and print() the elapsed time for each.

Processing a single model works fine:

i=0
start = time.time()
unmasker = pipeline('fill-mask', model=models[i])
unmasker("Hello I'm a [MASK] model.", top_k=1)
end = time.time() 
df = df.append({'Model': models[i], 'Time': end-start}, ignore_index=True)

Output:

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

However, iterating over many model names raises the error in the title.

Code:

from transformers import pipeline
import time

models = ['bert-base-uncased', 'roberta-base', 'distilbert-base-uncased', 'bert-base-cased', 'albert-base-v2', 'roberta-large', 'bert-large-uncased', 'albert-large-v2', 'albert-base-v2', 'bert-large-cased', 'albert-base-v1', 'bert-large-cased-whole-word-masking', 'bert-large-uncased-whole-word-masking', 'albert-xxlarge-v2', 'google/bigbird-roberta-large', 'albert-xlarge-v2', 'albert-xxlarge-v1', 'facebook/muppet-roberta-large', 'facebook/muppet-roberta-base', 'albert-large-v1', 'albert-xlarge-v1']

for _model in models:
    start = time.time()
    unmasker = pipeline('fill-mask', model=_model)
    unmasker("Hello I'm a [MASK] model.", top_k=1)  # default: top_k=5
    end = time.time()

    print(end-start)

Output:

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
---------------------------------------------------------------------------
PipelineException                         Traceback (most recent call last)
<ipython-input-19-13b5f651657e> in <module>
      3     start = time.time()
      4     unmasker = pipeline('fill-mask', model=_model)
----> 5     unmasker("Hello I'm a [MASK] model.", top_k=1)  # default: top_k=5
      6     end = time.time()
      7 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/pipelines/fill_mask.py in __call__(self, inputs, *args, **kwargs)
    224             - **token** (`str`) -- The predicted token (to replace the masked one).
    225         """
--> 226         outputs = super().__call__(inputs, **kwargs)
    227         if isinstance(inputs, list) and len(inputs) == 1:
    228             return outputs[0]

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/pipelines/base.py in __call__(self, inputs, num_workers, batch_size, *args, **kwargs)
   1099                 return self.iterate(inputs, preprocess_params, forward_params, postprocess_params)
   1100         else:
-> 1101             return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
   1102 
   1103     def run_multi(self, inputs, preprocess_params, forward_params, postprocess_params):

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/pipelines/base.py in run_single(self, inputs, preprocess_params, forward_params, postprocess_params)
   1105 
   1106     def run_single(self, inputs, preprocess_params, forward_params, postprocess_params):
-> 1107         model_inputs = self.preprocess(inputs, **preprocess_params)
   1108         model_outputs = self.forward(model_inputs, **forward_params)
   1109         outputs = self.postprocess(model_outputs, **postprocess_params)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/pipelines/fill_mask.py in preprocess(self, inputs, return_tensors, **preprocess_parameters)
     82             return_tensors = self.framework
     83         model_inputs = self.tokenizer(inputs, return_tensors=return_tensors)
---> 84         self.ensure_exactly_one_mask_token(model_inputs)
     85         return model_inputs
     86 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/pipelines/fill_mask.py in ensure_exactly_one_mask_token(self, model_inputs)
     76         else:
     77             for input_ids in model_inputs["input_ids"]:
---> 78                 self._ensure_exactly_one_mask_token(input_ids)
     79 
     80     def preprocess(self, inputs, return_tensors=None, **preprocess_parameters) -> Dict[str, GenericTensor]:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/pipelines/fill_mask.py in _ensure_exactly_one_mask_token(self, input_ids)
     67                 "fill-mask",
     68                 self.model.base_model_prefix,
---> 69                 f"No mask_token ({self.tokenizer.mask_token}) found on the input",
     70             )
     71 

PipelineException: No mask_token (<mask>) found on the input

Please let me know if there is anything else I can add to the post to clarify.


1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-01-06 10:59:26

Only certain models throw this error.
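A likely reason, judging from the traceback: the error message prints the tokenizer's own mask token (`<mask>`), while the input string hard-codes `[MASK]`. The sketch below builds the prompt from `unmasker.tokenizer.mask_token` instead. The helper names `build_prompt` and `time_fill_mask` are hypothetical and not part of the original post, and this has not been checked against every model in the list.

```python
import time


def build_prompt(mask_token: str, template: str = "Hello I'm a {mask} model.") -> str:
    """Insert the given mask token into the template sentence."""
    return template.format(mask=mask_token)


def time_fill_mask(model_name: str) -> float:
    """Time pipeline creation plus one prediction for a single model.

    Assumption: `transformers` is installed and the model can be downloaded.
    """
    from transformers import pipeline  # imported lazily so build_prompt works without it

    start = time.perf_counter()
    unmasker = pipeline('fill-mask', model=model_name)
    # Use the mask token this tokenizer expects, e.g. "[MASK]" or "<mask>".
    prompt = build_prompt(unmasker.tokenizer.mask_token)
    unmasker(prompt, top_k=1)
    return time.perf_counter() - start
```

Calling `time_fill_mask('roberta-base')` would then pass a prompt containing `<mask>` rather than `[MASK]`, assuming that is what the tokenizer reports.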

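Since I am only measuring the run time of arbitrary models, the following is good enough. It ran successfully for most of the models.

I applied try/except logic. Note that handling exceptions without naming the specific error in the except statement is considered bad practice.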

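import time

import pandas as pd
from transformers import pipeline

rows = []  # one record per successful run
for _model in models:  # `models` is the list from the question
    for i in range(10):
        start = time.time()
        try:
            unmasker = pipeline('fill-mask', model=_model)
            unmasker("Hello I'm a [MASK] model.", top_k=1)  # default: top_k=5
            print(_model)
        except Exception:  # deliberately broad: skip any model that fails
            continue
        end = time.time()

        # DataFrame.append was removed in pandas 2.0, so collect rows instead
        rows.append({'Model': _model, 'Time': end - start})
        df = pd.DataFrame(rows)
        print(df)
        df.to_csv('model_performance.csv', index=False)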
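
On the remark that an unnamed except is bad practice: one possible alternative is to pass the exception types to skip explicitly, so that anything unexpected still surfaces. This is a minimal sketch; the helper `timed_or_none` is hypothetical and not from the original answer.

```python
import time
from typing import Callable, Optional, Tuple, Type


def timed_or_none(
    fn: Callable[[], object],
    skip: Tuple[Type[BaseException], ...],
) -> Optional[float]:
    """Run fn and return the elapsed seconds, or None if it raised one of `skip`.

    Exceptions that are not listed in `skip` propagate to the caller.
    """
    start = time.perf_counter()
    try:
        fn()
    except skip:
        return None
    return time.perf_counter() - start
```

With transformers, `skip` could be `(PipelineException,)`, the exception type shown in the traceback, assuming it is importable from `transformers.pipelines.base` in the installed version.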
Votes: 0
Original page content provided by Stack Overflow; translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/70594724
