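I'm trying to reproduce the Wikipedia example for discounted cumulative gain (DCG). I can reproduce it in Excel, but I get different results in Python.
I followed the instructions here: https://www.geeksforgeeks.org/normalized-discounted-cumulative-gain-multilabel-ranking-metrics-ml/
My code: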
# import required package
from sklearn.metrics import ndcg_score, dcg_score
import numpy as np
# Relevance scores in Ideal order
true_relevance = np.asarray([[3, 3, 2, 2, 1, 0]])
# Relevance scores in output order
relevance_score = np.asarray([[3, 2, 3, 0, 1, 2]])
# DCG score
dcg = dcg_score(true_relevance, relevance_score)
print("DCG score : ", dcg)
# IDCG score
idcg = dcg_score(true_relevance, true_relevance)
print("IDCG score : ", idcg)
# Normalized DCG score
ndcg = dcg / idcg
print("nDCG score : ", ndcg)
Output:
DCG score : 6.57260640248932 #<- should be 6.861
IDCG score : 7.140995184095699 #<- this is OK
nDCG score : 0.9204048221636831 #<- should be 0.961
Any idea what is going wrong?
Posted on 2021-09-03 08:57:13
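Simply put, y_true holds the gains and y_score determines the order. y_true can be in any order; dcg_score ranks its entries by descending y_score. In your code the ideal-order relevances are passed as y_true and the output-order relevances as y_score, so the output-order values are treated as ranking scores rather than as gains. Consider this example: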
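import numpy as np
from sklearn.metrics import dcg_score
# Gains, in arbitrary order
y_true = np.asarray([[2, 0, 4]])
# Scores that decide the ranking (a larger score means a higher position)
y_score = np.asarray([[-33212424, -2, -1]])
pred_dcg = dcg_score(y_true, y_score)
ideal_dcg = dcg_score(y_true, y_true)
print(pred_dcg, ideal_dcg)
For pred_dcg, after sorting y_true by y_score (descending: a larger score means a higher position), we get:
[4, 0, 2]
with discounts:
[1/log_2(2), 1/log_2(3), 1/log_2(4)] = [1, 0.63, 0.5]
so the DCG score is 4*1 + 0*0.63 + 2*0.5 = 5.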
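To check the arithmetic by hand, here is a minimal pure-Python sketch of the same calculation. It is not sklearn's implementation: manual_dcg and position_scores are names I made up for illustration, and the function assumes linear gains with a log_2(rank + 1) discount and does no special handling of tied scores (as far as I know, dcg_score averages over tied scores by default). The second half applies it to the Wikipedia example from the question by treating the output-order relevances as the gains and using any strictly decreasing scores to keep that order.

```python
import math

def manual_dcg(gains, scores):
    """Hand-rolled DCG: order gains by descending score, discount by log2(rank + 1)."""
    # Pair each gain with its score and sort so the highest score comes first.
    pairs = sorted(zip(scores, gains), key=lambda p: p[0], reverse=True)
    ranked = [gain for _, gain in pairs]
    # Rank 1 is divided by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(ranked, start=1))

# Example from this answer: the ranked gains are [4, 0, 2].
pred_dcg = manual_dcg([2, 0, 4], [-33212424, -2, -1])
ideal_dcg = manual_dcg([2, 0, 4], [2, 0, 4])
print(pred_dcg, ideal_dcg)

# Wikipedia example from the question: gains are the relevances in output order.
wiki_gains = [3, 2, 3, 0, 1, 2]
# Hypothetical scores; any strictly decreasing values preserve the output order.
position_scores = [6, 5, 4, 3, 2, 1]
wiki_dcg = manual_dcg(wiki_gains, position_scores)
wiki_idcg = manual_dcg(wiki_gains, wiki_gains)
print(round(wiki_dcg, 3), round(wiki_idcg, 3), round(wiki_dcg / wiki_idcg, 3))
```

The same arrangement should carry over to sklearn: pass np.asarray([[3, 2, 3, 0, 1, 2]]) as y_true and a strictly decreasing array as y_score to dcg_score, and use dcg_score(y_true, y_true) for the ideal DCG.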
https://stackoverflow.com/questions/69023492