Default SimpleTransformers setup fails with ValueError about str input
Stack Overflow user
Asked 2021-05-30 05:17:52
Answers: 1 · Views: 147 · Followers: 0 · Votes: 2

I am trying to use SimpleTransformers with its default settings for multi-task learning.

I am using the example from their website.

The code is as follows:

import logging

import pandas as pd
from simpletransformers.t5 import T5Model, T5Args

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)


train_data = [
    ["binary classification", "Anakin was Luke's father" , 1],
    ["binary classification", "Luke was a Sith Lord" , 0],
    ["generate question", "Star Wars is an American epic space-opera media franchise created by George Lucas, which began with the eponymous 1977 film and quickly became a worldwide pop-culture phenomenon", "Who created the Star Wars franchise?"],
    ["generate question", "Anakin was Luke's father" , "Who was Luke's father?"],
]
train_df = pd.DataFrame(train_data)
train_df.columns = ["prefix", "input_text", "target_text"]

eval_data = [
    ["binary classification", "Leia was Luke's sister" , 1],
    ["binary classification", "Han was a Sith Lord" , 0],
    ["generate question", "In 2020, the Star Wars franchise's total value was estimated at US$70 billion, and it is currently the fifth-highest-grossing media franchise of all time.", "What is the total value of the Star Wars franchise?"],
    ["generate question", "Leia was Luke's sister" , "Who was Luke's sister?"],
]
eval_df = pd.DataFrame(eval_data)
eval_df.columns = ["prefix", "input_text", "target_text"]

model_args = T5Args()
model_args.num_train_epochs = 200
model_args.no_save = True
model_args.evaluate_generated_text = False
model_args.evaluate_during_training = False
model_args.evaluate_during_training_verbose = False
model_args.use_multiprocessing = False
model_args.use_multiprocessing_for_evaluation = False

model = T5Model("t5", "t5-base", args=model_args)


def count_matches(labels, preds):
    print(labels)
    print(preds)
    return sum([1 if label == pred else 0 for label, pred in zip(labels, preds)])


model.train_model(train_df, show_running_loss=True)

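I am not even using eval_df right now (although I plan to use it in my real code), because it is not set up correctly in their code. With such a simple setup I expected the library to work. However, after trying on two systems (one Windows, one Linux, both with the latest version of SimpleTransformers), I get the following error: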

  File "C:\Users\name\AppData\Local\Programs\Python\Python38\lib\site-packages\simpletransformers\t5\t5_utils.py", line 175, in <listcomp>       
    preprocess_data(d) for d in tqdm(data, disable=args.silent)
  File "C:\Users\name\AppData\Local\Programs\Python\Python38\lib\site-packages\simpletransformers\t5\t5_utils.py", line 81, in preprocess_data   
    batch = tokenizer.prepare_seq2seq_batch(
  File "C:\Users\name\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\tokenization_utils_base.py", line 3282, in prepare_seq2seq_batch
    labels = self(
  File "C:\Users\name\AppData\Local\Programs\Python\Python38\lib\site-packages\transformers\tokenization_utils_base.py", line 2262, in __call__  
    raise ValueError(
ValueError: text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) or `List[List[str]]` (batch of pretokenized examples).

I am using the exact setup from the example, and all the input DataFrames contain strings.

Can anyone help me figure out why this basic setup fails? Thanks.


1 Answer

Stack Overflow user

Accepted answer

Posted 2021-05-31 01:54:13

In the example code, if you change

train_data = [
    ["binary classification", "Anakin was Luke's father" , 1],
    ["binary classification", "Luke was a Sith Lord" , 0],
    ["generate question", "Star Wars is an American epic space-opera media franchise created by George Lucas, which began with the eponymous 1977 film and quickly became a worldwide pop-culture phenomenon", "Who created the Star Wars franchise?"],
    ["generate question", "Anakin was Luke's father" , "Who was Luke's father?"],
]

to
train_data = [
    ["binary classification", "Anakin was Luke's father" , '1'],
    ["binary classification", "Luke was a Sith Lord" , '0'],
    ["generate question", "Star Wars is an American epic space-opera media franchise created by George Lucas, which began with the eponymous 1977 film and quickly became a worldwide pop-culture phenomenon", "Who created the Star Wars franchise?"],
    ["generate question", "Anakin was Luke's father" , "Who was Luke's father?"],
]

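the error no longer occurs, so it was caused by the labels not being of type str. The integer labels 1 and 0 in the target_text column are what the tokenizer rejects.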
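For larger datasets, editing each label by hand is impractical. The following is a minimal sketch, assuming the rows follow the same [prefix, input_text, target_text] layout as the example. The helper names `find_non_string_cells` and `stringify_rows` are hypothetical and are not part of SimpleTransformers; they only locate non-string cells and convert them before the data is turned into a DataFrame.

```python
# Hypothetical helpers (not part of SimpleTransformers): locate and fix
# non-string cells in rows shaped like [prefix, input_text, target_text].


def find_non_string_cells(rows):
    """Return (row_index, column_index, value) for every cell that is not a str."""
    return [
        (i, j, value)
        for i, row in enumerate(rows)
        for j, value in enumerate(row)
        if not isinstance(value, str)
    ]


def stringify_rows(rows):
    """Return a copy of rows with every cell converted to str."""
    return [[str(value) for value in row] for row in rows]


train_data = [
    ["binary classification", "Anakin was Luke's father", 1],
    ["binary classification", "Luke was a Sith Lord", 0],
    ["generate question", "Anakin was Luke's father", "Who was Luke's father?"],
]

# The integer labels in column 2 are reported as offending cells.
print(find_non_string_cells(train_data))  # [(0, 2, 1), (1, 2, 0)]

fixed = stringify_rows(train_data)
print(find_non_string_cells(fixed))  # []
```

If the data is already in a DataFrame, `train_df["target_text"] = train_df["target_text"].astype(str)` should have the same effect on that column.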

Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/67755780
