Q: NLTK sentence_bleu with smoothing method 7 gives a score above 1

Stack Overflow user

Asked 2019-06-15 20:37:28

Answers: 1 | Views: 505 | Followers: 0 | Votes: 1

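When I use the NLTK sentence_bleu function together with SmoothingFunction method 7, the maximum score is 1.1167470964180197, even though BLEU is defined to lie between 0 and 1.

This score shows up for a candidate that matches the reference exactly. I use method 7 because my sentences are not always at least 4 tokens long; some are shorter. Method 5 gives the same result. The other methods return 1.0 as the perfect score.

It happens when I use a single reference and a single candidate, for example: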

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
cc = SmoothingFunction()
reference = ['overofficious 98461 54363 39016 78223 52180']
candidate = 'overofficious 98461 54363 39016 78223 52180'
sentence_bleu(reference, candidate, smoothing_function=cc.method7)

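This gives the score: 1.1167470964180197

Am I doing something wrong, is this the expected behavior, or is there a bug in the implementation of the smoothing function?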

nltk

bleu

1 Answer

Stack Overflow user

Accepted answer

Answered 2019-08-27 05:08:57

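It looks like the implementation is at least consistent with Chen and Cherry (2014). They suggest averaging the (n-1)-gram, n-gram, and (n+1)-gram matched counts, and they define m0_prime as m1 + 1 (so in our example it is 2, which is what throws off the calculation).

I am calling method5 directly here, since method7 uses it internally.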
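Written out as a recurrence, the averaging looks roughly like the following. This is a sketch based on my reading of NLTK's method5, which appears to apply the average to the modified precisions p_n rather than to raw counts; the primed notation is my own and the details may differ between NLTK versions.

```latex
m'_0 = p_1 + 1, \qquad
p'_n = \frac{m'_{n-1} + p_n + p_{n+1}}{3}, \qquad
m'_n = p'_n \quad (n = 1, \dots, 4)

% Exact match: p_n = 1 for every n, so m'_0 = 2 and
p'_n = \frac{p'_{n-1} + 2}{3} = 1 + 3^{-n} > 1
```

Under that assumption every smoothed precision for an exact match is strictly greater than 1, so their geometric mean is too.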

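from fractions import Fraction
from nltk.translate.bleu_score import SmoothingFunction

cc = SmoothingFunction()
references = ['overofficious 98461 54363 39016 78223 52180'.split()]
candidate = 'overofficious 98461 54363 39016 78223 52180'.split()
# For an exact match, every n-gram precision (n = 1..4) is 1.
p_n = [Fraction(1, 1)] * 4
p_n5 = cc.method5(p_n, references, candidate, len(candidate))
print(p_n5)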

Output:

[Fraction(4, 3), Fraction(10, 9), Fraction(28, 27), Fraction(82, 81)]

We can work these out by hand: 4/3 = (2 + 1 + 1) / 3, then 10/9 = (4/3 + 1 + 1) / 3, and so on.
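The hand calculation above can be reproduced without NLTK. The sketch below is not NLTK's code: average_smoothing is a hypothetical helper that mirrors the recurrence as described in this answer, and it assumes uniform BLEU weights and a brevity penalty of 1 (candidate and reference have the same length).

```python
from fractions import Fraction
import math


def average_smoothing(precisions, next_order_precision):
    """Hypothetical helper: average each precision with its neighbours.

    Mirrors the recurrence described in the answer, not NLTK's actual code.
    """
    extended = list(precisions) + [next_order_precision]
    previous = precisions[0] + 1  # m0_prime = m1 + 1
    smoothed = []
    for i, p in enumerate(precisions):
        previous = (previous + p + extended[i + 1]) / 3
        smoothed.append(previous)
    return smoothed


# Exact match: the 1- to 4-gram precisions and the 5-gram precision are all 1.
perfect = [Fraction(1)] * 4
smoothed = average_smoothing(perfect, Fraction(1))
print(smoothed)

# BLEU with uniform weights is the geometric mean of the precisions
# (brevity penalty assumed to be 1 here).
score = math.exp(sum(math.log(float(p)) for p in smoothed) / len(smoothed))
print(score)
```

The smoothed list matches the fractions shown above, and the geometric mean comes out at about 1.1167, the score reported in the question.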

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/56610465
