我试图在lightGBM (https://arxiv.org/pdf/1912.07753.pdf) (第5页)中实现基于本文的零膨胀日志正常损失函数。但是,无可否认,我只是不知道该怎么做。我不明白如何得到这个函数的梯度和恒心值,以便在LGBM中实现它,而且在过去,我从来不需要实现自定义丢失函数。
本文的作者已经开源了他们的代码,该函数可以在tensorflow (https://github.com/google/lifetime_value/blob/master/lifetime_value/zero_inflated_lognormal.py)中使用,但我无法将其转换成适合LightGBM中自定义丢失函数所需的参数。关于LGBM如何接受自定义丢失函数的一个示例-逻辑似然损失将写为:
def loglikelihood(preds, train_data):
labels = train_data.get_label()
preds = 1. / (1. + np.exp(-preds))
grad = preds - labels
hess = preds * (1. - preds)
return grad, hess类似地,我需要定义一个自定义eval度量来伴随它,例如:
def binary_error(preds, train_data):
labels = train_data.get_label()
preds = 1. / (1. + np.exp(-preds))
return 'error', np.mean(labels != (preds > 0.5)), False以上两个示例都摘自以下存储库:
非常感谢在这方面的任何帮助,特别是详细的指导,以帮助我学习如何自己做这件事。
根据LGBM关于定制损失职能的文件:
It should have the signature objective(y_true, y_pred) -> grad, hess or objective(y_true, y_pred, group) -> grad, hess:
y_true: numpy 1-D array of shape = [n_samples]
The target values.
y_pred: numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)
The predicted values. Predicted values are returned before any transformation, e.g. they are raw margin instead of probability of positive class for binary task.
group: numpy 1-D array
Group/query data. Only used in the learning-to-rank task. sum(group) = n_samples. For example, if you have a 100-document dataset with group = [10, 20, 40, 10, 10, 10], that means that you have 6 groups, where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
grad: numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)
The value of the first order derivative (gradient) of the loss with respect to the elements of y_pred for each sample point.
hess: numpy 1-D array of shape = [n_samples] or numpy 2-D array of shape = [n_samples, n_classes] (for multi-class task)
The value of the second order derivative (Hessian) of the loss with respect to the elements of y_pred for each sample point.发布于 2022-06-03 12:35:59
这是您定义的tensorflow实现的“转换”。大部分工作只是自己定义函数(如软加法、互熵等)。
链接纸中使用的是平均绝对百分比误差,不确定这是否是要使用的eval度量。
import math
import numpy as np
epsilon = 1e-7
def sigmoid(x):
return 1 / (1 + math.exp(-x))
def softplus(beta=1, threshold=20):
return 1 / beta* math.log(1 + math.exp(beta*x))
def BinaryCrossEntropy(y_true, y_pred):
y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
term_0 = (1-y_true) * np.log(1-y_pred + epsilon)
term_1 = y_true * np.log(y_pred + epsilon)
return -np.mean(term_0+term_1, axis=0)
def zero_inflated_lognormal_pred(logits):
positive_probs = sigmoid(logits[..., :1])
loc = logits[..., 1:2]
scale = softplus(logits[..., 2:])
preds = (
positive_probs *
np.exp(loc + 0.5 * np.square(scale)))
return preds
def mean_abs_pct_error(preds, train_data):
labels = train_data.get_label()
decile_labels=np.percentile(labels,np.linspace(10,100,10))
decile_preds=np.percentile(preds,np.linspace(10,100,10))
MAPE = sum(np.absolute(decile_preds - decile_labels)/decile_labels)
return 'error', MAPE, False
def zero_inflated_lognormal_loss(train_data,
logits):
labels = train_data.get_label()
positive = labels > 0
positive_logits = logits[..., :1]
classification_loss = BinaryCrossEntropy(
y_true=positive, y_pred=positive_logits)
loc = logits[..., 1:2]
scale = math.maximum(
softplus(logits[..., 2:]),
math.sqrt(epsilon))
safe_labels = positive * labels + (
1 - positive) * np.ones(labels.shape)
regression_loss = -np.mean(
positive * np.LogNormal(mean=loc, stdev=scale).log_prob(safe_labels),
axis=-1)
return classification_loss + regression_losshttps://stackoverflow.com/questions/72381715
复制相似问题