
Q: How to minimize the function of a trained model?

Data Science user

Asked 2018-06-17 19:19:36

Answers: 1 | Views: 81 | Followers: 0 | Votes: 2

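I have a real technological process that is described by a complex model (xgboost). That is, the current product quality (y) depends on the current temperature (x1), pressure (x2), and so on. I want to solve an optimization task: which minimal feature values can be chosen so that the product quality reaches its maximum? It looks like a simple optimization task: $\min_{x}\,(y(x) - y_0)^2$, where y is the model equation of the process and y0 is the maximum value, or some value close to the maximum. But it is impossible to get weight coefficients out of xgboost, so I can't use skopt, and even if I could get these coefficients, the actual equation would be very complicated. The only solution I have right now is to enumerate all possible values of all possible features, make a prediction for each combination, and select the one for which y reaches the maximum or comes close to it. Can you give me some advice on how to solve this problem?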
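For reference, the exhaustive search described above can be sketched as follows. This is only a minimal illustration: `predict_quality` is a hypothetical stand-in for the trained model's `predict` call, and the grid ranges and step counts are assumed values, not taken from the real process.

```python
import itertools


def predict_quality(temperature, pressure):
    """Hypothetical stand-in for the trained model's prediction."""
    return 10.0 - 0.05 * (temperature - 30.0) ** 2 - (pressure - 2.0) ** 2


def grid(lo, hi, steps):
    """Return `steps` evenly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (steps - 1)
    return [lo + i * step for i in range(steps)]


# Assumed search ranges for the two features.
temperatures = grid(25.0, 40.0, 16)  # 25, 26, ..., 40
pressures = grid(0.0, 4.0, 9)        # 0, 0.5, ..., 4

# Evaluate the model on every combination and keep the best one.
best = max(itertools.product(temperatures, pressures),
           key=lambda point: predict_quality(*point))
best_quality = predict_quality(*best)
print(best, best_quality)
```

Two features on this grid already need 16 * 9 = 144 predictions, and the count multiplies with every feature added, which is why a smarter search strategy is worth considering.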

recommender-system

xgboost

optimization

1 Answer

Data Science user

Accepted answer

Answered 2018-06-22 12:35:08

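There are several algorithms that can help you do this in a smart way.

These algorithms are usually used to tune a model's hyperparameters, so that is the context you will find in tutorials and examples. In your case you have to find a good set of feature values rather than a good set of hyperparameters, but the principle is the same.

My suggestions:

1) SMAC. This is based on Bayesian optimization. It is an iterative process in which a surrogate function is built and maximized:

The function to optimize (your XGBoost model) is evaluated at a point (in the feature hyperspace) where the optimizer thinks it may find a maximum (or, at the first iteration, at a point given by the user);
The result is added to the set of all evaluated points and used to build the surrogate function;
The surrogate function is maximized, and the coordinates of that maximum are assumed to be where the original function also has a maximum.

Repeat these three steps as many times as you want, each time starting again from the first step.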
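The three steps above can be sketched in plain Python. This is a toy illustration of the loop only, not SMAC's actual algorithm: `black_box` is a hypothetical stand-in for the XGBoost model, the surrogate is a simple inverse-distance-weighted average with an exploration bonus (an assumption made to keep the example dependency-free), and the surrogate is maximized by random sampling.

```python
import math
import random

# Assumed feature ranges: (lower, upper) for each of two continuous features.
BOUNDS = [(25.0, 40.0), (0.05, 4.0)]


def black_box(x1, x2):
    """Hypothetical stand-in for model.predict; its true peak is at (30, 2)."""
    return 10.0 - 0.05 * (x1 - 30.0) ** 2 - (x2 - 2.0) ** 2


def scaled_distance(p, q):
    """Euclidean distance after scaling every feature range to [0, 1]."""
    return math.sqrt(sum(((a - b) / (hi - lo)) ** 2
                         for a, b, (lo, hi) in zip(p, q, BOUNDS)))


def surrogate(point, evaluated, explore=2.0):
    """Toy surrogate: distance-weighted mean of observed values, plus a
    bonus that grows with the distance to the nearest evaluated point."""
    dists = [scaled_distance(point, p) for p, _ in evaluated]
    nearest = min(dists)
    if nearest < 1e-12:  # point was already evaluated
        return evaluated[dists.index(nearest)][1]
    weights = [1.0 / d ** 2 for d in dists]
    mean = sum(w * v for w, (_, v) in zip(weights, evaluated)) / sum(weights)
    return mean + explore * nearest


def optimize(n_iter=40, n_candidates=200, seed=0):
    rng = random.Random(seed)
    start = (35.0, 1.0)  # user-given point for the first iteration
    evaluated = [(start, black_box(*start))]
    for _ in range(n_iter):
        # Step 3: maximize the surrogate (here: best of random candidates).
        candidates = [tuple(rng.uniform(lo, hi) for lo, hi in BOUNDS)
                      for _ in range(n_candidates)]
        guess = max(candidates, key=lambda c: surrogate(c, evaluated))
        # Step 1: evaluate the real function at the proposed point.
        value = black_box(*guess)
        # Step 2: add the result to the set of evaluated points.
        evaluated.append((guess, value))
    return max(evaluated, key=lambda pv: pv[1])


best_point, best_value = optimize()
print(best_point, best_value)
```

The real function is called only once per iteration (41 times here in total), while the cheap surrogate absorbs the many exploratory evaluations.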

SMAC works with both continuous and categorical features, and you can also impose constraints among the features.

Here is an example in Python (untested code):

import numpy as np

from smac.configspace import ConfigurationSpace
from ConfigSpace.hyperparameters import UniformFloatHyperparameter, UniformIntegerHyperparameter
from smac.scenario.scenario import Scenario
from smac.facade.smac_facade import SMAC

# a continuous feature that you know has to lie in the [25, 40] range
cont_feat = UniformFloatHyperparameter("a_cont_feature", 25., 40., default_value=35.)

# another continuous feature, [0.05, 4] range
cont_feat2 = UniformFloatHyperparameter("another_cont_feature", 0.05, 4, default_value=1)

# a binary feature
bin_feat = UniformIntegerHyperparameter("a_bin_feature", 0, 1, default_value=1)

# the configuration space in which to search for the maximum
cs = ConfigurationSpace()
cs.add_hyperparameters([cont_feat, cont_feat2, bin_feat])

# Scenario object
scenario = Scenario({"run_obj": "quality",    # we optimize quality
                     "runcount-limit": 1000,  # maximum function evaluations
                     "cs": cs,                # the configuration space
                     "cutoff_time": None
                     })

# Be careful here! The features must be in the same order as in the
# training data for the XGB model to be evaluated correctly.
FEATURE_ORDER = ["a_cont_feature", "another_cont_feature", "a_bin_feature"]

# here we include the XGBoost model (`model` is your already-trained model)
def f_to_opt(cfg):
    # keep every feature, including those whose value is 0
    features = np.array([[cfg[name] for name in FEATURE_ORDER]])
    prediction = model.predict(features)[0]

    # SMAC minimizes its objective, so negate the prediction to maximize it
    return -float(prediction)


smac = SMAC(scenario=scenario, rng=np.random.RandomState(42),
            tae_runner=f_to_opt)
opt_feat_set = smac.optimize()

# the set of feature values which maximizes the output
print(opt_feat_set)

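2) dlib optimization. This converges much faster than the previous method. As a disclaimer, I have to say that this algorithm is, in principle, only valid for functions that fulfil a certain criterion, and an XGBoost model viewed as a function does not fulfil it. In practice, however, the procedure also works for less well-behaved functions, at least in the cases I have tried. So you may want to give it a try as well.

Example code: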

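import dlib
import numpy as np

# here we include the XGBoost model (`model` is your already-trained model).
# Note that we cannot use categorical/integer/binary features
def f_to_opt(cont_feat, cont_feat2):
    # predict expects a 2-D array (one row per sample); dlib expects a plain float back
    return float(model.predict(np.array([[cont_feat, cont_feat2]]))[0])


# x holds the best feature values found, y the corresponding prediction
x, y = dlib.find_max_global(f_to_opt,
                            [25, 0.05],  # Lower bound constraints on cont_feat and cont_feat2 respectively
                            [40, 4],     # Upper bound constraints on cont_feat and cont_feat2 respectively
                            1000)        # The number of times find_max_global() will call f_to_opt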

Votes: 0

Original page content provided by Data Science.

Original link:

https://datascience.stackexchange.com/questions/33283
