I am trying to compute a BLEU score, but it gives me poor results and I don't know why.
For example, here are my reference and predicted sentences:
ref :
['electron', 'and', 'a', 'proton']
predicted :
['electron', 'and', 'a', 'proton']
ref :
['to', 'reach', 'the', 'nectar', 'at', 'the', 'bottom', 'of', 'flowers']
predicted :
['to', 'reach', 'the', 'nectar', 'at', 'the', 'bottom', 'of', 'flowers']
ref :
['during', 'the', 'summer', 'near', 'the', 'north', 'pole']
predicted :
['during', 'the', 'summer', 'near', 'the', 'north', 'pole']
ref :
['only', 'blue', 'light', 'is', 'reflected', 'by', 'the', 'block']
predicted :
['only', 'blue', 'light', 'is', 'reflected', 'by', 'the', 'block']
ref :
['between', '20', 'and', '40', 'degrees', 'latitude']
predicted :
['between', '20', 'and', '40', 'degrees', 'latitude']
ref :
['external', 'and', 'internal', 'combustion', 'engines']
predicted :
['external', 'and', 'internal', 'combustion', 'engines']
ref :
['cleaning', 'disinfecting', 'and', 'in', 'swimming', 'pools']
predicted :
['cleaning', 'disinfecting', 'and', 'in', 'swimming', 'pools']
ref :
['body', 'mass', 'index', 'bmi']
predicted :
['body', 'mass', 'index', 'bmi']
ref :
['they', 'put', 'nutrients', 'into', 'the', 'soil', 'that', 'plants', 'use', 'to', 'grow']
predicted :
['they', 'put', 'nutrients', 'into', 'the', 'soil', 'that', 'plants', 'use', 'to', 'grow']
ref :
['structure', 'of', 'earth', 'interior']
predicted :
['structure', 'of', 'earth', 'interior']
Here is the code I use to compute the BLEU score:
from nltk.translate.bleu_score import corpus_bleu
print("Individual n-gram")
print("Individual 1-gram")
print('BLEU-1: %f' % corpus_bleu(ref, pre, weights=(1.0, 0, 0, 0)))
print("Individual 2-gram")
print('BLEU-2: %f' % corpus_bleu(ref, pre, weights=(0, 1.0, 0, 0)))
print("Individual 3-gram")
print('BLEU-3: %f' % corpus_bleu(ref, pre, weights=(0, 0, 1.0, 0)))
print("Individual 4-gram")
print('BLEU-4: %f' % corpus_bleu(ref, pre, weights=(0, 0, 0, 1.0)))
Individual n-gram
Individual 1-gram
BLEU-1: 0.015625
Individual 2-gram
BLEU-2: 0.000000
Individual 3-gram
BLEU-3: 0.000000
Individual 4-gram
BLEU-4: 0.000000
Can anyone help? I don't understand why it doesn't give me good results.
Posted on 2020-05-20 17:45:10
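You must have received a warning along with this output. The warning largely tells you why the scores are 0: in your case, corpus_bleu finds no overlapping 2-grams, 3-grams, or 4-grams between the references and the predictions.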

This issue has a detailed explanation, and I can't explain it any better: https://github.com/nltk/nltk/issues/1838
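To see how identical sentences can end up with no overlapping n-grams, here is a simplified sketch of clipped (modified) n-gram precision. It is not NLTK's actual implementation, and the helper names `ngrams` and `modified_precision` are chosen here for illustration. It also assumes that `ref` in the question was a flat list of token lists, which the question does not show. Under that assumption each token string is treated as a whole "reference sentence" and is split into characters:

```python
from collections import Counter
from fractions import Fraction


def ngrams(sequence, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    seq = list(sequence)
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def modified_precision(references, hypothesis, n):
    """Clipped n-gram precision of one hypothesis against its references."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    # For every n-gram, remember its highest count in any single reference.
    max_ref_counts = Counter()
    for reference in references:
        for gram, count in Counter(ngrams(reference, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in hyp_counts.items())
    total = max(1, sum(hyp_counts.values()))
    return Fraction(clipped, total)


sentence = ['electron', 'and', 'a', 'proton']

# Flat nesting: every token is read as a separate reference made of characters.
flat_unigram = modified_precision(sentence, sentence, 1)
flat_bigram = modified_precision(sentence, sentence, 2)

# Proper nesting: a list that contains one reference token list.
nested_unigram = modified_precision([sentence], sentence, 1)

print(flat_unigram, flat_bigram, nested_unigram)
```

With the flat nesting, only the single-character token 'a' can match a character unigram, and no word bigram can match a character bigram. The ten predictions in the question contain 64 tokens, and 'a' is the only single-character one, so this reading would give a unigram precision of 1/64 = 0.015625, which is consistent with the reported BLEU-1.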
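Edit (solution):
Although the warning explains the reason, you can fix it like this.
Note the nesting of ref and pre: ref is a list with one entry per hypothesis, and each entry is itself a list of reference token lists, while pre is a list of hypothesis token lists.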
from nltk.translate.bleu_score import corpus_bleu
ref =[[['electron', 'and', 'a', 'proton']]]
pre =[['electron', 'and', 'a', 'proton']]
print("Individual n-gram")
print("Individual 1-gram")
print('BLEU-1: %f' % corpus_bleu(ref, pre, weights=(1.0, 0, 0, 0)))
print("Individual 2-gram")
print('BLEU-2: %f' % corpus_bleu(ref, pre, weights=(0, 1.0, 0, 0)))
print("Individual 3-gram")
print('BLEU-3: %f' % corpus_bleu(ref, pre, weights=(0, 0, 1.0, 0)))
print("Individual 4-gram")
print('BLEU-4: %f' % corpus_bleu(ref, pre, weights=(0, 0, 0, 1.0)))
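The snippet above scores a single sentence. For a whole corpus like the one in the question, the same nesting can be built programmatically. The following is a minimal sketch, assuming there is exactly one reference per prediction; `wrap_references` is a hypothetical helper name and not part of NLTK, and the NLTK call is guarded so the wrapping step still runs if NLTK is not installed:

```python
def wrap_references(references):
    """Wrap each reference token list in its own list (one reference per hypothesis)."""
    return [[reference] for reference in references]


# A few sentence pairs taken from the question, one token list per sentence.
refs = [
    ['electron', 'and', 'a', 'proton'],
    ['during', 'the', 'summer', 'near', 'the', 'north', 'pole'],
    ['body', 'mass', 'index', 'bmi'],
]
preds = [
    ['electron', 'and', 'a', 'proton'],
    ['during', 'the', 'summer', 'near', 'the', 'north', 'pole'],
    ['body', 'mass', 'index', 'bmi'],
]

# Shape expected by corpus_bleu: [[ref_for_hyp_1], [ref_for_hyp_2], ...]
list_of_references = wrap_references(refs)

try:
    from nltk.translate.bleu_score import corpus_bleu
except ImportError:
    corpus_bleu = None  # NLTK not available; only the wrapping step is shown

if corpus_bleu is not None:
    for n in range(1, 5):
        # Put all the weight on the n-gram order being reported.
        weights = tuple(1.0 if i == n - 1 else 0.0 for i in range(4))
        score = corpus_bleu(list_of_references, preds, weights=weights)
        print('BLEU-%d: %f' % (n, score))
```

If a hypothesis has several acceptable references, they would all go into the inner list for that hypothesis instead of a single one.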
You can also refer to Python's built-in help, e.g. help(corpus_bleu).

https://datascience.stackexchange.com/questions/74545