Q: How can I make H2OGridSearch with H2OGradientBoostingEstimator reproducible in a Spark environment?

Stack Overflow user

Asked 2018-10-02 23:50:35

Answers: 1 | Views: 298 | Followers: 0 | Votes: 0

I am using the code below to run a GBM in Sparkling Water. I have set the seed and score_each_iteration=True, but the AUC still differs every time I check it.

from h2o.grid.grid_search import H2OGridSearch
from h2o.estimators.gbm import H2OGradientBoostingEstimator

# initialize the estimator
gbm_cov = H2OGradientBoostingEstimator(sample_rate = 0.7,
                                       col_sample_rate = 0.7,
                                       ntrees = 1000,
                                       balance_classes = True,
                                       score_each_iteration = True,
                                       nfolds = 5,
                                       seed = 1234)

# set up hyper parameter search space
gbm_hyper_params = {'learn_rate': [0.01, 0.015, 0.025, 0.05, 0.1],
                     'max_depth': [3, 5, 7, 9, 12],
                     #'sample_rate': [i * 0.1 for i in range(6, 11)],
                     #'col_sample_rate': [i * 0.1 for i in range(6, 11)],
                     #'ntrees': [i * 100 for i in range(1, 11)]
                }

# define Search criteria
gbm_search_criteria = {'strategy': "RandomDiscrete", 
                        'max_models': 10, 
                        'max_runtime_secs': 1800,
                        'stopping_metric': eval_metric, 
                        'stopping_tolerance': 0.001, 
                        'stopping_rounds': 3,
                        'seed': 1
                       }

# build grid search 
gbm_grid = H2OGridSearch(model = gbm_cov,
                     hyper_params = gbm_hyper_params,
                     search_criteria = gbm_search_criteria # we can use "Cartesian" if search space is small
                    )

# train using the grid
gbm_grid.train(x = top_feature, y = y, training_frame =htrain)

h2o

sparkling-water

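1 Answer

Stack Overflow user

Accepted answer

Posted 2018-10-03 16:02:52

Commenting out 'max_runtime_secs': 1800 in the search criteria solves the reproducibility problem. I also noticed one more thing, though I do not know why: if the early-stopping settings below are moved from the search criteria to the H2OGradientBoostingEstimator, the code runs faster.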

'stopping_metric': eval_metric, 
'stopping_tolerance': 0.001, 
'stopping_rounds': 3,
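
The two changes described above can be sketched as follows. This is a minimal illustration rather than the answerer's exact code: the helper names (build_estimator_params, build_search_criteria, train_grid) are hypothetical, and "AUC" is an assumed stand-in for the question's eval_metric, which is never defined in the post. A likely reason the time limit hurts reproducibility is that how much work fits in 1800 seconds depends on how fast the cluster happens to run. A plausible reason for the speed-up, which the answer does not confirm, is that stopping settings on the estimator let each model stop before it reaches 1000 trees, whereas in the search criteria they control when the grid stops adding models.

```python
# Sketch only: parameter values come from the question; helper names are hypothetical.

def build_estimator_params(stopping_metric="AUC", seed=1234):
    """GBM parameters with per-model early stopping set on the estimator."""
    return {
        "sample_rate": 0.7,
        "col_sample_rate": 0.7,
        "ntrees": 1000,
        "balance_classes": True,
        "score_each_iteration": True,
        "nfolds": 5,
        "seed": seed,
        # Early stopping moved here from the search criteria.
        # "AUC" is an assumption; the question's eval_metric is not shown.
        "stopping_metric": stopping_metric,
        "stopping_tolerance": 0.001,
        "stopping_rounds": 3,
    }


def build_search_criteria(seed=1):
    """Random-search criteria without a wall-clock limit."""
    return {
        "strategy": "RandomDiscrete",
        "max_models": 10,
        "seed": seed,
        # 'max_runtime_secs' is intentionally left out: the accepted answer
        # reports that removing it makes the results reproducible.
    }


def train_grid(x, y, training_frame, hyper_params, stopping_metric="AUC"):
    """Train the grid; requires a running H2O / Sparkling Water cluster."""
    # Imported lazily so the two builders above can be inspected without h2o.
    from h2o.grid.grid_search import H2OGridSearch
    from h2o.estimators.gbm import H2OGradientBoostingEstimator

    gbm = H2OGradientBoostingEstimator(**build_estimator_params(stopping_metric))
    grid = H2OGridSearch(model=gbm,
                         hyper_params=hyper_params,
                         search_criteria=build_search_criteria())
    grid.train(x=x, y=y, training_frame=training_frame)
    return grid
```

With the variables from the question, the call would look like train_grid(top_feature, y, htrain, gbm_hyper_params). Training twice and comparing the AUC of each model is a simple way to check whether the runs now match.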

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/52617898
