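I'm currently trying to oversample with SMOTE and then run my XGBClassifier inside a pipeline. For some reason, I can't get HyperOpt to work well with the pipeline.
The following two examples both run fine: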
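import numpy as np
from hyperopt import STATUS_OK, Trials, fmin, tpe
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # assumed: imblearn's Pipeline, since scikit-learn's does not accept samplers
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

# Example 1: cross-validate the SMOTE + XGBoost pipeline directly.
smote = SMOTE(random_state=42)
model = XGBClassifier(random_state=42)
pipe = Pipeline([('smote', smote),
                 ('model', model)])
cv = StratifiedKFold(n_splits=5)
score = cross_val_score(pipe, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()
print(score)

# Example 2: tune the bare XGBClassifier with HyperOpt (no pipeline).
model = XGBClassifier(random_state=42)

def objective_pipe(params):
    model.set_params(**params)
    cv = StratifiedKFold(n_splits=5)
    score = cross_val_score(model, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()
    return {'loss': -score, 'params': params, 'status': STATUS_OK}

trials = Trials()
# `params` is the HyperOpt search space, defined elsewhere (not shown in the post).
best = fmin(fn=objective_pipe, space=params, algo=tpe.suggest, max_evals=10, trials=trials, rstate=np.random.RandomState(42))

However, when I put the pipeline inside the objective function, I end up with NaN for the score.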
smote = SMOTE(random_state=42)
model = XGBClassifier(random_state=42)
pipe = Pipeline([('smote', smote),
                 ('model', model)])

def objective_pipe(params):
    # Same objective as before, but the whole pipeline is tuned and scored.
    pipe.set_params(**params)
    cv = StratifiedKFold(n_splits=5)
    score = cross_val_score(pipe, X_train, y_train, cv=cv, scoring='roc_auc', n_jobs=-1).mean()
    return {'loss': -score, 'params': params, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=objective_pipe, space=params, algo=tpe.suggest, max_evals=10, trials=trials, rstate=np.random.RandomState(42))

Maybe I'm just missing something very simple, but I'm not really sure how to fix this. Any suggestions, help, or resources are welcome.
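
One thing that may be worth checking in the pipeline version: `Pipeline.set_params` addresses a step's parameters as `<step>__<param>`, so a search space written for the bare `XGBClassifier` would need its keys prefixed with `model__`. The post does not show the search space, so whether this applies here is an assumption. A minimal sketch, where `prefix_params` is a hypothetical helper and the two parameter names are only illustrative:

```python
def prefix_params(params, step="model"):
    """Return a copy of `params` with each key renamed to '<step>__<key>'."""
    return {f"{step}__{name}": value for name, value in params.items()}


# Illustrative values only; the real search space is not shown in the post.
xgb_params = {"max_depth": 4, "learning_rate": 0.1}
pipe_params = prefix_params(xgb_params)
print(pipe_params)  # {'model__max_depth': 4, 'model__learning_rate': 0.1}
```

Inside the objective this would be used as `pipe.set_params(**prefix_params(params))`.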
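Posted on 2020-12-29 18:31:37
I don't know why, but I ran into a similar problem and solved it by setting n_jobs=1. I think it has to do with SMOTE not being able to run in parallel.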
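
A rough sketch of that change, assuming scikit-learn is installed. `make_objective` is a hypothetical helper, not something from the original post. It defaults to `n_jobs=1` and also passes `error_score="raise"`, because `cross_val_score` by default records a failed fit as NaN (`error_score=np.nan`), and raising instead makes the underlying exception visible.

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score


def make_objective(estimator, X, y, n_jobs=1):
    """Build a HyperOpt-style objective that scores `estimator` by 5-fold CV."""

    def objective(params):
        estimator.set_params(**params)
        cv = StratifiedKFold(n_splits=5)
        # error_score="raise" re-raises a failing fit instead of recording NaN.
        score = cross_val_score(
            estimator, X, y, cv=cv, scoring="roc_auc",
            n_jobs=n_jobs, error_score="raise",
        ).mean()
        # "ok" is the value of hyperopt.STATUS_OK.
        return {"loss": -score, "params": params, "status": "ok"}

    return objective
```

With the pipeline from the question, this would be passed to HyperOpt as `fmin(fn=make_objective(pipe, X_train, y_train), ...)`.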
https://stackoverflow.com/questions/62918134