
Need help implementing a custom loss function (zero-inflated lognormal loss) in LightGBM

Stack Overflow user
Asked on 2022-05-25 17:06:11
1 answer · 1.1K views · 0 following · 6 votes

I am trying to implement the zero-inflated lognormal loss function from this paper (https://arxiv.org/pdf/1912.07753.pdf) (page 5) in LightGBM but, admittedly, I just don't know how to do it. I don't understand how to derive the gradient and hessian of this function in order to implement it in LGBM, and I have never needed to implement a custom loss function in the past.
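For intuition on where the gradient and hessian come from: they are the first and second derivatives of the per-sample loss with respect to the raw score. A generic way to sanity-check a hand derivation (this sketch is illustrative, not from the question) is to compare it against finite differences:

```python
import numpy as np

def logistic_loss(raw_score, label):
    # per-sample logistic loss on the raw margin (what LightGBM passes around)
    p = 1.0 / (1.0 + np.exp(-raw_score))
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

def analytic_grad_hess(raw_score, label):
    # hand-derived first and second derivatives w.r.t. the raw score
    p = 1.0 / (1.0 + np.exp(-raw_score))
    return p - label, p * (1.0 - p)

# finite-difference check of both derivatives at a few points
eps_g, eps_h = 1e-5, 1e-4
for s, y in [(-2.0, 1.0), (0.3, 0.0), (1.7, 1.0)]:
    grad, hess = analytic_grad_hess(s, y)
    num_grad = (logistic_loss(s + eps_g, y)
                - logistic_loss(s - eps_g, y)) / (2 * eps_g)
    num_hess = (logistic_loss(s + eps_h, y) - 2 * logistic_loss(s, y)
                + logistic_loss(s - eps_h, y)) / eps_h ** 2
    assert abs(grad - num_grad) < 1e-6
    assert abs(hess - num_hess) < 1e-4
```

The same check applies to any custom loss: if the analytic `grad`/`hess` disagree with the central differences, the derivation is wrong.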

The authors of the paper have open-sourced their code, and the function is available in tensorflow (https://github.com/google/lifetime_value/blob/master/lifetime_value/zero_inflated_lognormal.py), but I am unable to translate it into the form required for a custom loss function in LightGBM. As an example of how LGBM accepts custom loss functions, a log-likelihood loss would be written as:

import numpy as np

def loglikelihood(preds, train_data):
    labels = train_data.get_label()
    preds = 1. / (1. + np.exp(-preds))
    grad = preds - labels
    hess = preds * (1. - preds)
    return grad, hess

Similarly, I need to define a custom eval metric to go with it, for example:

def binary_error(preds, train_data):
    labels = train_data.get_label()
    preds = 1. / (1. + np.exp(-preds))
    return 'error', np.mean(labels != (preds > 0.5)), False

Both of the examples above are taken from the following repository:

https://github.com/microsoft/LightGBM/blob/e83042f20633d7f74dda0d18624721447a610c8b/examples/python-guide/advanced_example.py#L136
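To make the calling convention of the two snippets concrete, here is a sketch that exercises them with a stand-in for `lightgbm.Dataset` (the `FakeDataset` class is mine, purely for illustration; with a real `lgb.Dataset` these would be supplied to training, e.g. `feval=binary_error`, with the objective passed as `fobj=` in older LightGBM versions or `params["objective"]` in newer ones):

```python
import numpy as np

# Stand-in for lightgbm.Dataset with just enough surface to exercise
# the two callbacks below.
class FakeDataset:
    def __init__(self, labels):
        self._labels = np.asarray(labels, dtype=float)

    def get_label(self):
        return self._labels

def loglikelihood(preds, train_data):
    labels = train_data.get_label()
    preds = 1. / (1. + np.exp(-preds))
    return preds - labels, preds * (1. - preds)

def binary_error(preds, train_data):
    labels = train_data.get_label()
    preds = 1. / (1. + np.exp(-preds))
    return 'error', np.mean(labels != (preds > 0.5)), False

dtrain = FakeDataset([1, 0, 1, 0])
raw = np.array([2.0, -1.0, -0.5, 1.5])   # raw margins, as LightGBM passes them
grad, hess = loglikelihood(raw, dtrain)
name, err, higher_better = binary_error(raw, dtrain)
# grad/hess have one entry per sample; the metric returns
# (name, value, is_higher_better)
```

Note that both callbacks receive raw margins, so each applies the sigmoid itself before comparing against the labels.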

Any help here would be greatly appreciated, especially detailed guidance to help me learn how to do this myself.

From the LGBM documentation on custom objective functions:

It should have the signature objective(y_true, y_pred) -> grad, hess or objective(y_true, y_pred, group) -> grad, hess:

y_true: numpy 1-D array of shape = [n_samples]
The target values.

y_pred: numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)
The predicted values. Predicted values are returned before any transformation, e.g. they are raw margin instead of probability of positive class for binary task.

group: numpy 1-D array
Group/query data. Only used in the learning-to-rank task. sum(group) = n_samples. For example, if you have a 100-document dataset with group = [10, 20, 40, 10, 10, 10], that means that you have 6 groups, where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.

grad: numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)
The value of the first order derivative (gradient) of the loss with respect to the elements of y_pred for each sample point.

hess: numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)
The value of the second order derivative (Hessian) of the loss with respect to the elements of y_pred for each sample point.

1 Answer

Stack Overflow user

Answered on 2022-06-03 12:35:59

Here is a "translation" of the tensorflow implementation you linked. Most of the work is just defining the functions yourself (softplus, cross-entropy, etc.).

The linked paper uses mean absolute percentage error; not sure if that is the eval metric you want to use.

import numpy as np

epsilon = 1e-7

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softplus(x, beta=1, threshold=20):
    # numerically stable softplus; above `threshold` it is effectively linear
    return np.where(beta * x > threshold, x, np.logaddexp(0, beta * x) / beta)

def BinaryCrossEntropy(y_true, y_pred):
    # y_pred here are probabilities, not raw logits
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    term_0 = (1 - y_true) * np.log(1 - y_pred + epsilon)
    term_1 = y_true * np.log(y_pred + epsilon)
    return -np.mean(term_0 + term_1, axis=0)

def lognormal_log_prob(x, loc, scale):
    # log-density of a lognormal distribution with parameters (loc, scale) at x
    return (-np.log(x * scale * np.sqrt(2 * np.pi))
            - (np.log(x) - loc) ** 2 / (2 * scale ** 2))

def zero_inflated_lognormal_pred(logits):
    positive_probs = sigmoid(logits[..., :1])
    loc = logits[..., 1:2]
    scale = softplus(logits[..., 2:])
    preds = positive_probs * np.exp(loc + 0.5 * np.square(scale))
    return preds

def mean_abs_pct_error(preds, train_data):
    # decile-level mean absolute percentage error
    labels = train_data.get_label()
    decile_labels = np.percentile(labels, np.linspace(10, 100, 10))
    decile_preds = np.percentile(preds, np.linspace(10, 100, 10))
    MAPE = sum(np.absolute(decile_preds - decile_labels) / decile_labels)
    return 'error', MAPE, False

# NOTE: LightGBM calls a custom objective as objective(preds, train_data) and
# expects (grad, hess) back; this function computes the loss value itself.
def zero_inflated_lognormal_loss(train_data, logits):
    # logits: array of shape [n_samples, 3]
    labels = train_data.get_label().reshape(-1, 1)
    positive = (labels > 0).astype(float)

    positive_logits = logits[..., :1]
    classification_loss = BinaryCrossEntropy(
        y_true=positive, y_pred=sigmoid(positive_logits))

    loc = logits[..., 1:2]
    scale = np.maximum(softplus(logits[..., 2:]), np.sqrt(epsilon))
    safe_labels = positive * labels + (1 - positive) * np.ones(labels.shape)
    regression_loss = -np.mean(
        positive * lognormal_log_prob(safe_labels, loc, scale), axis=-1)
    return classification_loss + regression_loss
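A loss value alone is not enough for LightGBM's objective, which needs per-sample `grad` and `hess`. One hedged route, without deriving them by hand, is finite differences on a per-sample version of the loss. This is a self-contained sketch on a simplified per-sample ZILN negative log-likelihood (the function and variable names are mine, not from the paper's code); for production use, analytic derivatives would be preferable:

```python
import numpy as np

EPS = 1e-7

def ziln_loss_per_sample(logits, label):
    # logits: length-3 vector [classification logit, loc, scale logit].
    # Simplified zero-inflated lognormal negative log-likelihood for one sample.
    p = 1.0 / (1.0 + np.exp(-logits[0]))          # P(label > 0)
    loc = logits[1]
    scale = np.maximum(np.log1p(np.exp(logits[2])), np.sqrt(EPS))  # softplus
    positive = float(label > 0)
    class_loss = -(positive * np.log(p + EPS)
                   + (1 - positive) * np.log(1 - p + EPS))
    safe_label = label if label > 0 else 1.0
    # lognormal log-density of safe_label under (loc, scale)
    log_prob = (-np.log(safe_label * scale * np.sqrt(2 * np.pi))
                - (np.log(safe_label) - loc) ** 2 / (2 * scale ** 2))
    return class_loss - positive * log_prob

def numeric_grad_hess(logits, label, h=1e-4):
    # Central differences per output dimension; hess is the diagonal only,
    # which is the shape LightGBM expects for multi-output custom objectives.
    grad = np.zeros(3)
    hess = np.zeros(3)
    f0 = ziln_loss_per_sample(logits, label)
    for k in range(3):
        step = np.zeros(3)
        step[k] = h
        fp = ziln_loss_per_sample(logits + step, label)
        fm = ziln_loss_per_sample(logits - step, label)
        grad[k] = (fp - fm) / (2 * h)
        hess[k] = (fp - 2 * f0 + fm) / h ** 2
    return grad, hess

grad, hess = numeric_grad_hess(np.array([0.5, 1.0, 0.0]), label=3.0)
# grad/hess each hold one entry per raw model output for this sample
```

Looping this over all samples gives the `[n_samples, 3]` grad/hess arrays a three-output LightGBM objective would return; numerical differentiation is slow and noisy, so it is best used to validate an analytic implementation rather than to train with.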
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/72381715
