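I have been trying to use sklearn's GridSearchCV several times in the script below (once per logistic regression, three times in total). The first grid search, for the first logistic regression, completes, but when the second grid search is about to start the terminal hangs and nothing happens.
I am using Keras for the logistic regression.
I would appreciate some feedback, because this problem is rather annoying.
PS. This is my first post, so I am happy to provide more information if needed.
Here is the script: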
# Imports are assumed here; the original post does not show them.
import numpy as np
from keras.callbacks import History
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

# mutation_prediction, plot_loss, plot_accuracy and the X_*/Y_* DataFrames
# are defined elsewhere in the script.

def braf():
    mut_pred = mutation_prediction(X_train_all_genes, Y_train_all_genes,
                                   X_valid=X_test_all_genes, Y_valid=Y_test_all_genes)
    print('Starting BRAF...')
    BRAF_history = History()
    braf_estimator = KerasClassifier(build_fn=mut_pred.braf_model, epochs=30,
                                     batch_size=15, verbose=0)
    # 5 x 5 grid over the learning rate and the lasso (L1) rate
    braf_param_grid = dict(
        braf_learning_rate=list(np.linspace(0, 0.0001, num=5)),
        braf_lasso_rate=list(np.linspace(0, 0.0001, num=5)))
    braf_grid = GridSearchCV(estimator=braf_estimator, cv=2,
                             param_grid=braf_param_grid, n_jobs=30, pre_dispatch=5)
    braf_grid_result = braf_grid.fit(X_train_all_genes.values,
                                     Y_train_all_genes['BRAF_mutant'].values,
                                     callbacks=[BRAF_history])
    print('Done with BRAF')
    plot_loss(BRAF_history.history['loss'], title='BRAF LOSS')
    plot_accuracy(BRAF_history.history['acc'], title='BRAF Accuracy')
    BRAF_pred = list(map(lambda x: int(x), braf_grid.predict(X_test_all_genes.values)))
    return BRAF_pred
def kras():
    mut_pred = mutation_prediction(X_train_all_genes, Y_train_all_genes,
                                   X_valid=X_test_all_genes, Y_valid=Y_test_all_genes)
    print('Starting KRAS...')
    KRAS_history = History()
    kras_estimator = KerasClassifier(build_fn=mut_pred.kras_model, epochs=30,
                                     batch_size=15, verbose=0)
    kras_param_grid = dict(
        kras_learning_rate=list(np.linspace(0, 0.0001, num=10)),
        kras_lasso_rate=list(np.linspace(0, 0.0001, num=10)))
    kras_grid = GridSearchCV(estimator=kras_estimator, cv=2,
                             param_grid=kras_param_grid, n_jobs=30, pre_dispatch=5)
    kras_grid_result = kras_grid.fit(X_train_all_genes.values,
                                     Y_train_all_genes['KRAS_mutant'].values,
                                     callbacks=[KRAS_history])
    print('Done with KRAS')
    plot_loss(KRAS_history.history['loss'], title='KRAS LOSS')
    plot_accuracy(KRAS_history.history['acc'], title='KRAS Accuracy')
    KRAS_pred = list(map(lambda x: int(x), kras_grid.predict(X_test_all_genes.values)))
    return KRAS_pred
def tp53():
    mut_pred = mutation_prediction(X_train_all_genes, Y_train_all_genes,
                                   X_valid=X_test_all_genes, Y_valid=Y_test_all_genes)
    print('Starting TP53...')
    TP53_history = History()
    tp53_estimator = KerasClassifier(build_fn=mut_pred.tp53_model, epochs=30,
                                     batch_size=15, verbose=0)
    tp53_param_grid = dict(
        tp53_learning_rate=list(np.linspace(0, 0.001, num=10)),
        tp53_lasso_rate=list(np.linspace(0, 0.0001, num=10)))
    # The keyword is n_jobs (the post had jobs=30, which GridSearchCV does not accept)
    tp53_grid = GridSearchCV(estimator=tp53_estimator, cv=2,
                             param_grid=tp53_param_grid, n_jobs=30, pre_dispatch=5)
    tp53_grid_result = tp53_grid.fit(X_train_all_genes.values,
                                     Y_train_all_genes['TP53_mutant'].values,
                                     callbacks=[TP53_history])
    print('Done with TP53')
    plot_loss(TP53_history.history['loss'], title='TP53 LOSS')
    plot_accuracy(TP53_history.history['acc'], title='TP53 Accuracy')
    TP53_pred = list(map(lambda x: int(x), tp53_grid.predict(X_test_all_genes.values)))
    # Return added to match braf() and kras(); it was missing from the post
    return TP53_pred
In my main(), I call the functions above to run LR on these three genes and return the predictions made with the best combination of learning rate and lasso rate.
Any feedback would be helpful.
UPDATE: When I interrupt the process, I get the following:
Process ForkPoolWorker-58:
Process ForkPoolWorker-42:
Process ForkPoolWorker-56:
Process ForkPoolWorker-54:
Process ForkPoolWorker-52:
Process ForkPoolWorker-46:
Process ForkPoolWorker-40:
Process ForkPoolWorker-44:
Process ForkPoolWorker-38:
Process ForkPoolWorker-36:
Process ForkPoolWorker-60:
Process ForkPoolWorker-59:
Process ForkPoolWorker-43:
Process ForkPoolWorker-37:
Process ForkPoolWorker-39:
Process ForkPoolWorker-41:
Process ForkPoolWorker-45:
Process ForkPoolWorker-48:
Process ForkPoolWorker-53:
Process ForkPoolWorker-47:
Process ForkPoolWorker-57:
Process ForkPoolWorker-55:
Process ForkPoolWorker-49:
Process ForkPoolWorker-51:
Traceback (most recent call last):
  File "FinalProjectV1.py", line 354, in <module>
    main()
  File "FinalProjectV1.py", line 332, in main
    KRAS_pred_test=kras()
  File "FinalProjectV1.py", line 309, in kras
    kras_grid_result = kras_grid.fit(X_train_all_genes.values, Y_train_all_genes['KRAS_mutant'].values,callbacks=[KRAS_history])
  File "/soe/ianastop/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 639, in fit
    cv.split(X, y, groups)))
  File "/soe/ianastop/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 789, in __call__
    self.retrieve()
  File "/soe/ianastop/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/soe/ianastop/venv/lib/python3.6/multiprocessing/pool.py", line 638, in get
    self.wait(timeout)
  File "/soe/ianastop/venv/lib/python3.6/multiprocessing/pool.py", line 635, in wait
    self._event.wait(timeout)
  File "/soe/ianastop/venv/lib/python3.6/threading.py", line 551, in wait
    signaled = self._cond.wait(timeout)
  File "/soe/ianastop/venv/lib/python3.6/threading.py", line 295, in wait
    waiter.acquire()
KeyboardInterrupt
It looks like this has something to do with the multiprocessing library?
Answered 2018-06-11 18:52:04
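I can reproduce the problem on my MacBook Pro.
The problem here is the TensorFlow session. If a session is created in the parent process before GridSearchCV.fit() is called, the fit will hang for sure. The workers in your traceback are forked (ForkPoolWorker), so they inherit whatever session state the parent already holds. In your script, the refit and the predict() call after the first grid search both run in the parent process, which would likely explain why the first search finishes and the second one hangs.
One possible solution is to confine all session-creation code to the KerasClassifier class and the model creation function.
In addition, you may want to limit TF's memory usage in the model creation function or in a subclass of KerasClassifier.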
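As a rough sketch of those two suggestions, the model creation function below creates its own session and builds the model inside the function body, so nothing TensorFlow-related is set up at module level in the parent. It assumes Keras 2.x on the TensorFlow 1.x backend. The name build_braf_model, the n_features default, and the single-unit model are hypothetical stand-ins for mut_pred.braf_model, which the question does not show. I have not verified that this alone removes the hang in the original script.

```python
def build_braf_model(braf_learning_rate=1e-4, braf_lasso_rate=1e-4, n_features=100):
    """Hypothetical build_fn: all TensorFlow/Keras state is created in here."""
    # Imports are deferred so that importing this module in the parent process
    # does not touch TensorFlow (assumes Keras 2.x on TensorFlow 1.x).
    import tensorflow as tf
    from keras import backend as K
    from keras.layers import Dense
    from keras.models import Sequential
    from keras.optimizers import Adam
    from keras.regularizers import l1

    # Start from a clean graph, then install a session that allocates GPU
    # memory on demand instead of reserving all of it up front.
    K.clear_session()
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    K.set_session(tf.Session(config=config))

    # Logistic regression: one sigmoid unit with an L1 (lasso) penalty.
    model = Sequential()
    model.add(Dense(1, activation='sigmoid', input_dim=n_features,
                    kernel_regularizer=l1(braf_lasso_rate)))
    model.compile(loss='binary_crossentropy',
                  optimizer=Adam(lr=braf_learning_rate),
                  metrics=['accuracy'])
    return model
```

It would be passed as KerasClassifier(build_fn=build_braf_model, epochs=30, batch_size=15, verbose=0). The parameter names match the keys of braf_param_grid, so GridSearchCV can set them. If a hard cap is preferred over on-demand allocation, config.gpu_options.per_process_gpu_memory_fraction can be set instead of allow_growth.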
Quick solution:
Set n_jobs = 1, but this will take a long time to finish.
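To get a feel for how long, it helps to count the model fits. GridSearchCV fits every parameter combination once per fold, plus one final refit on the full training set when refit is left at its default. The sketch below uses only the standard library; linspace and count_fits are hypothetical helpers written for this illustration, not part of sklearn.

```python
from itertools import product


def linspace(start, stop, num):
    """Evenly spaced values with both endpoints included, like np.linspace."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def count_fits(param_grid, cv, refit=True):
    """Number of model fits for an exhaustive grid search with cv folds."""
    n_candidates = len(list(product(*param_grid.values())))
    return n_candidates * cv + (1 if refit else 0)


# Grids with the same shapes as in the question
braf_grid = {
    "braf_learning_rate": linspace(0, 0.0001, 5),
    "braf_lasso_rate": linspace(0, 0.0001, 5),
}
kras_grid = {
    "kras_learning_rate": linspace(0, 0.0001, 10),
    "kras_lasso_rate": linspace(0, 0.0001, 10),
}

print(count_fits(braf_grid, cv=2))  # 51
print(count_fits(kras_grid, cv=2))  # 201
```

With epochs=30 per fit, the KRAS and TP53 searches each train 201 models one after another when n_jobs = 1.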
References:
https://stackoverflow.com/questions/50803502