Q: Python: LightGBM hyperparameter tuning ValueError

Stack Overflow user

Asked 2021-04-08 15:09:48

Answers: 2 | Views: 990 | Followers: 0 | Votes: 1

I wrote the following code to run RandomizedSearchCV on a LightGBM classifier model, but I get the following error.

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

Code

import lightgbm as lgb
fit_params={"early_stopping_rounds":30, 
            "eval_metric" : 'f1', 
            "eval_set" : [(X_val,y_val)],
            'eval_names': ['valid'],
            'verbose': 100,
            # 'categorical_feature': 'auto'
            }

from scipy.stats import randint as sp_randint
from scipy.stats import uniform as sp_uniform
param_test ={'num_leaves': sp_randint(6, 50), 
             'min_child_samples': sp_randint(100, 500), 
             'min_child_weight': [1e-5, 1e-3, 1e-2, 1e-1, 1, 1e1, 1e2, 1e3, 1e4],
             'subsample': sp_uniform(loc=0.2, scale=0.8), 
             'colsample_bytree': sp_uniform(loc=0.4, scale=0.6),
             'reg_alpha': [0, 1e-1, 1, 2, 5, 7, 10, 50, 100],
             'reg_lambda': [0, 1e-1, 1, 5, 10, 20, 50, 100]}

n_HP_points_to_test = 100

from sklearn.model_selection import RandomizedSearchCV
# n_estimators is set to a "large value". The actual number of trees built depends on early stopping; 5000 is only the absolute maximum.
clf = lgb.LGBMClassifier(max_depth=-1, 
                         random_state=42, 
                         silent=True, 
                         metric='f1', 
                         n_jobs=4, 
                         n_estimators=5000,
                         )

gs = RandomizedSearchCV(
    estimator=clf, param_distributions=param_test, 
    n_iter=n_HP_points_to_test,
    scoring='f1',
    cv=3,
    refit=True,
    random_state=41,
    verbose=True)

gs.fit(X_trn, y_trn, **fit_params)
print('Best score reached: {} with params: {} '.format(gs.best_score_, gs.best_params_))

Attempted solutions

I tried to implement the solutions given in the following links, but none of them worked. How can I fix this?

python

python-3.x

scikit-learn

classification

lightgbm

2 Answers

Stack Overflow user

Answered 2021-04-09 07:42:01

F1 is not among LightGBM's built-in metrics. You can easily add a custom eval_metric:

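import numpy as np
from sklearn.metrics import f1_score

def lightgbm_eval_metric_f1(preds, dtrain):
    # Custom metric for the native API (lightgbm.train, feval=...).
    # Assumes preds are probabilities, which must become class labels for F1.
    target = dtrain.get_label()
    weight = dtrain.get_weight()

    unique_targets = np.unique(target)
    if len(unique_targets) > 2:
        cols = len(unique_targets)
        rows = int(preds.shape[0] / len(unique_targets))
        preds = np.reshape(preds, (rows, cols), order="F")
        labels = np.argmax(preds, axis=1)   # most probable class
        average = "macro"                   # assumption: macro-averaged F1
    else:
        labels = (preds > 0.5).astype(int)  # assumption: 0.5 threshold
        average = "binary"

    # Returns (metric name, metric value, is_higher_better).
    return "f1", f1_score(target, labels, sample_weight=weight, average=average), True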
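The function above takes (preds, dtrain), which is the form used by the native lightgbm.train API. The question uses the scikit-learn wrapper, whose fit method accepts a callable eval_metric of the form (y_true, y_pred) instead. A minimal sketch of that variant follows, assuming binary 0/1 labels and that y_pred holds positive-class probabilities; the name lgbm_f1_sklearn and the 0.5 threshold are illustrative choices, not part of the original answer.

```python
import numpy as np
from sklearn.metrics import f1_score


def lgbm_f1_sklearn(y_true, y_pred):
    """Custom eval metric in the (y_true, y_pred) form used by LGBMClassifier.fit.

    Assumes binary labels in {0, 1} and that y_pred holds positive-class
    probabilities; 0.5 is an arbitrary illustrative threshold.
    """
    y_hat = (np.asarray(y_pred) > 0.5).astype(int)
    # Returns (metric name, metric value, is_higher_better).
    return "f1", f1_score(y_true, y_hat), True


# Quick check on hand-made values (no model needed).
y_true = np.array([1, 0, 1, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.6])
name, value, higher_is_better = lgbm_f1_sklearn(y_true, y_prob)
print(name, round(value, 3), higher_is_better)
```

With the question's setup, this would be passed as "eval_metric": lgbm_f1_sklearn in fit_params, replacing the unrecognized string 'f1'.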

As for the optimization itself, I prefer to use the Optuna framework with LightGBM's native Python API (lightgbm.train), which works very well.

Optuna framework: https://github.com/optuna/optuna

However, the easiest way to tune LightGBM with Optuna is to use MLJAR AutoML (it has the f1 metric built in).

from supervised.automl import AutoML

automl = AutoML(
    mode="Optuna",
    algorithms=["LightGBM"],
    optuna_time_budget=600,  # 10 minutes for tuning
    eval_metric="f1",
)
automl.fit(X, y)  # X, y: your feature matrix and labels

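MLJAR AutoML framework: https://github.com/mljar/mljar-supervised

If you want to check the details of the LightGBM + Optuna optimization in MLJAR, here is the code: https://github.com/mljar/mljar-supervised/blob/master/supervised/tuner/optuna/lightgbm.py

Votes: 1

Stack Overflow user

Answered 2021-04-08 21:10:41

The last message in your third link (February 2020) indicates that this error is raised when the metric is not recognized, and "f1" is indeed not one of LightGBM's built-in metrics. Either use one of the built-in metrics (you can still use F1 as the selection criterion for the hyperparameter search), or create a custom metric (see the note at the end of the method's documentation).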

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/67006876
