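I want to classify sentiment using the pretrained English flair model from the flair library. I have about 90,000 tweets, and I want to classify each of them.
The problem is that the flair library takes about 7 hours to do this. By comparison, other NLP sentiment classifiers or TextBlob can finish the same task in about 1 minute.
My code for this is: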
def flair_sentiment(data, classifier):
    """
    data : text sequence (pandas.Series)
    classifier : pretrained flair classifier
    """
    values = []
    for item in data:
        # One Sentence and one predict() call per tweet: this is the slow part.
        tokenized = Sentence(item)
        classifier.predict(tokenized)
        values.append(tokenized.labels[0].score)
    return values

df['sentiment'] = flair_sentiment(df.tweets, classifier)

Posted on 2020-09-22 18:50:52
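I think you can try the following:
Below is code that uses batch prediction to analyze the sentiment of tweets. It also shows the running time of the two sentiment models. As you can see, the RNN-based model ("sentiment-fast") is much faster than the default model ("sentiment").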
from time import time

from flair.data import Sentence
from flair.models import TextClassifier


def flair_sentiment(texts, classifier):
    # Build all Sentence objects first, then predict them in mini-batches.
    sentences = [Sentence(text) for text in texts]
    classifier.predict(sentences, mini_batch_size=32)
    return [
        (sent.labels[0].value, sent.labels[0].score)
        for sent in sentences
    ]


for sentiment_model_name in ("sentiment", "sentiment-fast"):
    classifier = TextClassifier.load(sentiment_model_name)
    start_time = time()
    # 1024 sample tweets (512 copies of each of the two texts)
    tweets = 512 * [
        "For what a beautiful day. #elated",
        "It's broken"
    ]
    sentiments = flair_sentiment(tweets, classifier)
    # print(sentiments)
    print(f"* Sentiment model {sentiment_model_name}: running time = {time() - start_time:.2f} second(s)")

Output:
2020-09-22 11:50:14,027 loading file /Users/khuc/.flair/models/sentiment-en-mix-distillbert_3.1.pt
* Sentiment model sentiment: running time = 19.99 second(s)
2020-09-22 11:50:36,369 loading file /Users/khuc/.flair/models/sentiment-en-mix-ft-rnn.pt
* Sentiment model sentiment-fast: running time = 0.43 second(s)

https://stackoverflow.com/questions/57831633
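
To apply the batched approach above to the roughly 90,000 tweets from the question, it may help to feed the texts through in chunks, so that only one chunk of Sentence objects is held in memory at a time. The following is a minimal sketch under stated assumptions, not a tested recipe for that dataset: `chunked`, `classify_in_chunks`, `make_flair_predictor`, and the default `chunk_size` are hypothetical names and values introduced here for illustration. Only `Sentence`, `TextClassifier.load`, and `classifier.predict(..., mini_batch_size=...)` come from the answer's code.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Prediction = Tuple[str, float]  # (label, confidence score)


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def classify_in_chunks(
    texts: Sequence[str],
    predict_batch: Callable[[Sequence[str]], List[Prediction]],
    chunk_size: int = 5000,  # assumed value; tune to the available memory
) -> List[Prediction]:
    """Run `predict_batch` chunk by chunk, keeping the input order."""
    results: List[Prediction] = []
    for chunk in chunked(texts, chunk_size):
        results.extend(predict_batch(chunk))
    return results


def make_flair_predictor(model_name: str = "sentiment-fast",
                         mini_batch_size: int = 32):
    """Wrap a flair classifier as a `predict_batch` callable (hypothetical helper)."""
    # Imported lazily so the chunking helpers can be used without flair installed.
    from flair.data import Sentence
    from flair.models import TextClassifier

    classifier = TextClassifier.load(model_name)

    def predict_batch(batch: Sequence[str]) -> List[Prediction]:
        sentences = [Sentence(text) for text in batch]
        classifier.predict(sentences, mini_batch_size=mini_batch_size)
        return [(s.labels[0].value, s.labels[0].score) for s in sentences]

    return predict_batch


# Possible usage with the DataFrame from the question (not run here):
# predict = make_flair_predictor("sentiment-fast")
# predictions = classify_in_chunks(df.tweets.tolist(), predict)
# df["label"] = [label for label, _ in predictions]
# df["sentiment"] = [score for _, score in predictions]
```

Passing the predictor in as a callable keeps the chunking logic independent of flair, so either model name from the answer can be swapped in without changing the loop.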