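I have a text summarization project in which I summarize hundreds of texts one after another, and I compute the ROUGE score for each summary. However, I need to keep the ROUGE scores in a list before I can produce statistics, and I can't figure out how to do that. Can you help me?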
from rouge_score import rouge_scorer
scorer = rouge_scorer.RougeScorer(['rouge1'])
scorer.score(hyp,ref)
scores.append(scorer.score(hyp,ref))
Sample result:
[{'rouge1': Score(precision=0.46017699115044247, recall=0.45217391304347826,
fmeasure=0.45614035087719296)},
{'rouge1': Score(precision=0.1693121693121693, recall=0.2831858407079646,
fmeasure=0.21192052980132448)}]
Of course, I can't access the results directly.
Posted on 2021-05-05 17:15:54
If you want to access the Score object directly, you need to index the returned dictionary by its key ('rouge1').
So scores.append(scorer.score(hyp,ref)) becomes scores.append(scorer.score(hyp,ref)['rouge1']).
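If the scores have already been collected as a list of dictionaries, as in the question, the individual values can still be pulled out afterwards. The sketch below assumes the returned Score behaves like a named tuple with the fields precision, recall and fmeasure (which is what the sample output suggests); the local Score definition is only a stand-in so the example runs without the library, and the numbers are the sample results from the question.

```python
from collections import namedtuple

# Stand-in for the Score tuple shown in the sample output (assumption:
# same three fields, defined here so the example runs on its own).
Score = namedtuple("Score", ["precision", "recall", "fmeasure"])

# The list from the question: one {'rouge1': Score} dictionary per summary.
scores = [
    {"rouge1": Score(precision=0.46017699115044247,
                     recall=0.45217391304347826,
                     fmeasure=0.45614035087719296)},
    {"rouge1": Score(precision=0.1693121693121693,
                     recall=0.2831858407079646,
                     fmeasure=0.21192052980132448)},
]

# Index each dictionary by the metric key, then read the field by name.
precisions = [s["rouge1"].precision for s in scores]
recalls = [s["rouge1"].recall for s in scores]
fmeasures = [s["rouge1"].fmeasure for s in scores]

print(fmeasures)
```

Reading the fields by name avoids relying on their position inside the tuple.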
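The following code is a more general version that computes the ROUGE metric for each document and stores the results separately in a single dictionary: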
# import the rouge_score library
from rouge_score import rouge_scorer
# a list of the hypothesis documents
hyp = ['This is the first sample', 'This is another example']
# a list of the reference documents
ref = ['This is the first sentence', 'It is one more sentence']
# make a RougeScorer object with rouge_types=['rouge1']
scorer = rouge_scorer.RougeScorer(['rouge1'])
# a dictionary that will contain the results
results = {'precision': [], 'recall': [], 'fmeasure': []}
# for each hypothesis/reference pair
for (h, r) in zip(hyp, ref):
    # compute the ROUGE scores
    # note: the library's signature is score(target, prediction); the argument
    # order is kept as posted, swap it to measure against the reference as target
    score = scorer.score(h, r)
    # separate the measurements
    precision, recall, fmeasure = score['rouge1']
    # add them to the proper list in the dictionary
    results['precision'].append(precision)
    results['recall'].append(recall)
    results['fmeasure'].append(fmeasure)
The output will look like this:
{'fmeasure': [0.8000000000000002, 0.22222222222222224],
'precision': [0.8, 0.2],
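'recall': [0.8, 0.25]}
In addition, I would suggest the rouge library, which is another implementation of the ROUGE paper. Its results may differ slightly, but it offers some useful features, including the ability to compute the ROUGE metrics by passing in whole collections of documents and to average the results over all of them.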
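Once the values sit in one list per measurement, as in the results dictionary above, the statistics the question asks for can be computed with Python's standard statistics module. This is a minimal sketch, not part of the original answer: the helper name summarize is hypothetical, and the numbers are copied from the output shown above.

```python
import statistics

# Per-metric lists, copied from the output shown above.
results = {
    "precision": [0.8, 0.2],
    "recall": [0.8, 0.25],
    "fmeasure": [0.8000000000000002, 0.22222222222222224],
}


def summarize(values):
    """Return basic descriptive statistics for a list of floats.

    Hypothetical helper for illustration only.
    """
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # the sample standard deviation needs at least two data points
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# One summary dictionary per measurement.
summary = {metric: summarize(values) for metric, values in results.items()}

for metric, stats in summary.items():
    print(metric, stats)
```

With hundreds of summaries the same loop applies unchanged; only the lists get longer.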
Posted on 2021-12-31 08:12:34
You can keep just the F1 (fmeasure) scores as your statistic.
scores = []
for (hyp, ref) in zip(hyps, refs):
    # score() returns a dictionary, so select 'rouge1' before reading fmeasure
    scores.append(scorer.score(hyp, ref)['rouge1'].fmeasure)
https://stackoverflow.com/questions/67390427