Q: Computing NDCG in recommender systems

Data Science user

Asked 2019-11-24 05:11:41

1 answer, 5.4K views, 0 followers, 2 votes

This is a question about NDCG, an evaluation metric for recommender systems.

The metric is defined as follows.

DCG = r_1 + \sum\limits_{i=2}^{N}\frac{r_i}{\log_2 i}

nDCG = \frac{DCG}{DCG_{perfect}}
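As a rough illustration, the two formulas above can be transcribed almost literally. The function names `dcg_from_formula` and `ndcg_from_formula` are hypothetical, and returning 0.0 when the perfect DCG is zero is an assumption made here to avoid a division by zero. Note that this formula uses a discount of 1 at rank 1 and log2(i) from rank 2 onward, which is not the same discount as the log2(rank + 1) variant used in the question's own code.

```python
import math


def dcg_from_formula(relevances):
    """DCG = r_1 + sum_{i=2}^{N} r_i / log2(i), with 1-based ranks."""
    if not relevances:
        return 0.0
    total = float(relevances[0])  # rank 1 is not discounted
    for i, r in enumerate(relevances[1:], start=2):
        total += r / math.log2(i)
    return total


def ndcg_from_formula(relevances):
    """nDCG = DCG / DCG_perfect, where the perfect order sorts by relevance."""
    perfect = dcg_from_formula(sorted(relevances, reverse=True))
    # Assumption: report 0.0 when no item has any relevance at all.
    return dcg_from_formula(relevances) / perfect if perfect > 0 else 0.0


print(dcg_from_formula([3, 2]))            # 3 + 2 / log2(2) = 5.0
print(ndcg_from_formula([5, 4, 3, 2, 1]))  # already in perfect order: 1.0
```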

The code is as follows:

import numpy as np


def dcg_score(y_true, y_score, k=20, gains="exponential"):
    """Discounted cumulative gain (DCG) at rank k
    Parameters
    ----------
    y_true: array-like, shape = [n_samples]
        Ground truth (true relevance labels).
    y_score: array-like, shape = [n_samples]
        Predicted scores.
    k: int
        Rank.
    gains: str
        Whether gains should be "exponential" (default) or "linear".
    Returns
    -------
    DCG @k: float
    """
    # Rank items by predicted score, highest first, and keep the top k labels.
    order = np.argsort(y_score)[::-1]
    y_true = np.take(y_true, order[:k])

    if gains == "exponential":
        gains = 2 ** y_true - 1
    elif gains == "linear":
        gains = y_true
    else:
        raise ValueError("Invalid gains option.")

    # Ranks are 1-based and the discount is log2(rank + 1),
    # so add 2 to the 0-based index.
    discounts = np.log2(np.arange(len(y_true)) + 2)
    return np.sum(gains / discounts)


def ndcg_score(y_true, y_score, k=20, gains="exponential"):
    """Normalized discounted cumulative gain (NDCG) at rank k
    Parameters
    ----------
    y_true: array-like, shape = [n_samples]
        Ground truth (true relevance labels).
    y_score: array-like, shape = [n_samples]
        Predicted scores.
    k: int
        Rank.
    gains: str
        Whether gains should be "exponential" (default) or "linear".
    Returns
    -------
    NDCG @k: float
    """
    # Ideal DCG: rank the items by their true relevance.
    best = dcg_score(y_true, y_true, k, gains)
    actual = dcg_score(y_true, y_score, k, gains)
    return actual / best

Suppose k = 5.

In that case, how should NDCG be computed for items that could not be recommended within the top k?

For example,

y_true = [5,4,3,2,1]

y_score = [0,0,0,0,0] # 0 means we could not recommend within the top 5

In this case,

>>> np.argsort([0, 0, 0, 0])[::-1]
array([3, 2, 1, 0])
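To see what this ordering does to the labels, here is a small sketch using the `y_true` and `y_score` from the example. Passing `kind="stable"` to `np.argsort` is an addition made here so that the tie order is reproducible; the default sort kind does not guarantee any particular order among tied scores.

```python
import numpy as np

y_true = np.array([5, 4, 3, 2, 1])
y_score = np.zeros(5)  # every item has the same score, so all are tied

# A stable sort keeps tied items in their original order, so reversing the
# result lists the tied indices from last to first.
order = np.argsort(y_score, kind="stable")[::-1]
ranked_relevance = np.take(y_true, order)

print(order)             # [4 3 2 1 0]
print(ranked_relevance)  # [1 2 3 4 5]
```

With these inputs the true labels end up ranked from least to most relevant, which suggests that the resulting NDCG is decided by how ties happen to be broken rather than by anything the recommender did.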

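So, according to the code above,

NDCG@5 = 1.0

This looks strange.

In this situation, should the score be 0, or should the case be excluded from the NDCG calculation altogether?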
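One possible convention, sketched below, is to score the list of item ids that was actually recommended and to give zero gain to anything that is missing or irrelevant. This is an assumption for illustration, not what the code above does: the names `ndcg_at_k` and `relevance_by_item` are hypothetical, and returning 0.0 when the ideal DCG is zero is a choice made here. The gain and discount follow the "exponential" variant of the code above.

```python
import math


def dcg(relevances):
    """DCG with exponential gains and a log2(rank + 1) discount (1-based ranks)."""
    return sum((2 ** r - 1) / math.log2(rank + 1)
               for rank, r in enumerate(relevances, start=1))


def ndcg_at_k(recommended, relevance_by_item, k=5):
    """NDCG@k for a list of recommended item ids.

    Recommended items without a known relevance count as relevance 0, and
    relevant items that were never recommended add nothing to the DCG.
    """
    ranked = [relevance_by_item.get(item, 0) for item in recommended[:k]]
    ideal = sorted(relevance_by_item.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    # Assumption: a user with no relevant items at all scores 0.0.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


relevance = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}
print(ndcg_at_k([], relevance))                         # nothing recommended: 0.0
print(ndcg_at_k(["x", "y", "z"], relevance))            # only irrelevant items: 0.0
print(ndcg_at_k(["a", "b", "c", "d", "e"], relevance))  # ideal order: 1.0
```

Under this convention a recommender that fails to surface any relevant item within the top k scores 0 instead of being rewarded by an arbitrary tie order.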

If you have any references, please share them.

Thanks.

python

recommender-system

ndcg

1 Answer

Data Science user

Accepted answer

Posted 2020-06-23 20:13:51

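In my opinion,

the basic definition of DCG is that it is a measure of ranking quality. It assumes that you have already computed the utility of each document/item and sorted the items in some order.

With this definition in mind, if you have n items with the same utility (0 in your example), then computing NDCG to measure the ranking quality within that subset (since you are only looking at items 5, 4, 3, 2, and 1, none of which were recommended) will give you an NDCG score of 1, because your ranking is perfect if you only look at those items.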
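A quick way to check this argument is to enumerate every possible ordering. The sketch below uses linear gains with a log2(rank + 1) discount; the helper names are hypothetical, and returning NaN when the ideal DCG is zero is an assumption chosen to flag that the ratio is undefined when every utility is 0.

```python
import itertools
import math


def dcg(relevances):
    """DCG with linear gains and a log2(rank + 1) discount (1-based ranks)."""
    return sum(r / math.log2(rank + 1)
               for rank, r in enumerate(relevances, start=1))


def ndcg(relevances):
    ideal = dcg(sorted(relevances, reverse=True))
    # Assumption: 0 / 0 is reported as NaN instead of raising an error.
    return dcg(relevances) / ideal if ideal > 0 else float("nan")


tied = [3, 3, 3, 3, 3]
mixed = [5, 4, 3, 2, 1]

tied_scores = {ndcg(list(p)) for p in itertools.permutations(tied)}
mixed_scores = {round(ndcg(list(p)), 6) for p in itertools.permutations(mixed)}

print(tied_scores)             # {1.0}: every ordering of equal utilities is "perfect"
print(len(mixed_scores) > 1)   # True: with distinct utilities the order matters
print(ndcg([0, 0, 0, 0, 0]))   # nan: the ratio is undefined when all utilities are 0
```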

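NDCG is simply a way to quantify the quality of an ordering, that is, how the current order compares with the perfect order (items sorted w.r.t. their utility). It is not meaningful if you only look at items that all have the same utility score.

I hope this answers your question.

0 votes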

Original page content provided by Data Science.

Original link:

https://datascience.stackexchange.com/questions/63667
