
Q: Predicting sentiment of raw text using a trained BERT model (Hugging Face)

Stack Overflow user

Asked 2021-11-03 06:07:05

Answers: 1 · Views: 329 · Followers: 0 · Votes: 0

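I am doing sentiment analysis on tweets, predicting positive, negative, and neutral classes. I trained a BERT model using Hugging Face. Now I want to run predictions on a dataframe of unlabeled Twitter text, and I am stuck.

I followed this tutorial (https://curiousily.com/posts/sentiment-analysis-with-bert-and-hugging-face-using-pytorch-and-python/) and was able to train a BERT model with Hugging Face.

The tutorial includes an example of predicting on raw text, but it handles only a single sentence, and I want to use a column of tweets. https://curiousily.com/posts/sentiment-analysis-with-bert-and-hugging-face-using-pytorch-and-python/#predicting-on-raw-text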

review_text = "I love completing my todos! Best app ever!!!"

encoded_review = tokenizer.encode_plus(
  review_text,
  max_length=MAX_LEN,
  add_special_tokens=True,
  return_token_type_ids=False,
  pad_to_max_length=True,
  return_attention_mask=True,
  return_tensors='pt',
)

input_ids = encoded_review['input_ids'].to(device)
attention_mask = encoded_review['attention_mask'].to(device)
output = model(input_ids, attention_mask)
_, prediction = torch.max(output, dim=1)
print(f'Review text: {review_text}')
print(f'Sentiment  : {class_names[prediction]}')

Review text: I love completing my todos! Best app ever!!!
Sentiment  : positive

Bill's answer worked. Here is the solution:

def predictionPipeline(text):
  encoded_review = tokenizer.encode_plus(
      text,
      max_length=MAX_LEN,
      add_special_tokens=True,
      return_token_type_ids=False,
      pad_to_max_length=True,
      return_attention_mask=True,
      return_tensors='pt',
    )

  input_ids = encoded_review['input_ids'].to(device)
  attention_mask = encoded_review['attention_mask'].to(device)

  output = model(input_ids, attention_mask)
  _, prediction = torch.max(output, dim=1)

  return(class_names[prediction])

df2['prediction']=df2['cleaned_tweet'].apply(predictionPipeline)

pytorch

sentiment-analysis

huggingface-transformers

pytorch-dataloader

1 Answer

Stack Overflow user

Accepted answer

Posted 2021-11-18 17:08:03

You can use the same code to predict the texts in a dataframe column.

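import pandas as pd
import torch

# device, MAX_LEN and class_names are assumed to be defined as in the question.
model = ...
tokenizer = ...

def predict(review_text):
    # Tokenize one text and pad it to MAX_LEN.
    # Note: pad_to_max_length is deprecated in recent transformers releases
    # in favor of padding='max_length' (usually combined with truncation=True).
    encoded_review = tokenizer.encode_plus(
        review_text,
        max_length=MAX_LEN,
        add_special_tokens=True,
        return_token_type_ids=False,
        pad_to_max_length=True,
        return_attention_mask=True,
        return_tensors='pt',
    )

    input_ids = encoded_review['input_ids'].to(device)
    attention_mask = encoded_review['attention_mask'].to(device)
    output = model(input_ids, attention_mask)
    # Index of the highest-scoring class for this text
    _, prediction = torch.max(output, dim=1)
    print(f'Review text: {review_text}')
    print(f'Sentiment  : {class_names[prediction]}')
    return class_names[prediction]


df = pd.DataFrame({
    'texts': ["text1", "text2", "...."]
})

# Apply predict to every row and store the result in the same dataframe
df["sentiments"] = df.apply(lambda l: predict(l.texts), axis=1)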
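Calling the model once per row works, but it can be slow on a large column of tweets. A common alternative is to score the texts in batches and then map each row's highest score to a label. The sketch below shows only that batching and label-mapping logic in plain Python, so it runs without a trained model. The names `iter_batches`, `predict_labels`, `toy_score_batch`, and the order of `CLASS_NAMES` are assumptions made for illustration and do not come from the thread; `toy_score_batch` is a keyword-based stand-in for the real model.

```python
from typing import Callable, Iterable, List, Sequence

# Assumed label order; it must match the order used when the model was trained.
CLASS_NAMES = ["negative", "neutral", "positive"]


def iter_batches(items: Sequence[str], batch_size: int) -> Iterable[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_labels(
    texts: Sequence[str],
    score_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 32,
) -> List[str]:
    """Score texts batch by batch and return one label per text, in input order."""
    labels: List[str] = []
    for batch in iter_batches(texts, batch_size):
        scores = score_batch(batch)  # one row of class scores per text
        for row in scores:
            # argmax over the class scores of a single text
            best = max(range(len(row)), key=row.__getitem__)
            labels.append(CLASS_NAMES[best])
    return labels


def toy_score_batch(batch: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in for the model: crude keyword-based scores."""
    rows = []
    for text in batch:
        lowered = text.lower()
        rows.append([
            float("hate" in lowered),  # negative score
            0.5,                       # neutral score
            float("love" in lowered),  # positive score
        ])
    return rows


tweets = ["I love this app", "I hate waiting", "It is an app"]
print(predict_labels(tweets, toy_score_batch, batch_size=2))
# prints ['positive', 'negative', 'neutral']
```

With the trained model, `score_batch` would presumably tokenize the whole batch in one call, run the model under `torch.no_grad()`, and return the output scores as nested lists. The result could then be assigned in one step, for example `df["sentiments"] = predict_labels(df["texts"].tolist(), score_batch)`.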

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/69820318
