
GridSearchCV hangs when called multiple times in a script

Stack Overflow user
Asked 2018-06-11 17:54:27
1 answer, 700 views, 0 followers, 1 vote

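I have been trying to use sklearn's GridSearchCV several times in the script below (once per logistic regression, three times in total). What ends up happening is that the first grid search, for the first logistic regression, completes, but when the second grid search is about to start, the terminal hangs and nothing happens.

I am using Keras to do the logistic regression.

I would appreciate some feedback, because this problem is rather annoying.

PS: This is my first post, so I am happy to provide more information if needed.

Here is the script: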

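import numpy as np
from keras.callbacks import History
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

# mutation_prediction, plot_loss, plot_accuracy and the X_*/Y_* data frames
# are defined elsewhere in the script (not shown in the question).

def braf():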

    mut_pred=mutation_prediction(X_train_all_genes, Y_train_all_genes, X_valid=X_test_all_genes, Y_valid=Y_test_all_genes)

    print('Starting BRAF...')
    BRAF_history= History()
    braf_estimator = KerasClassifier(build_fn=mut_pred.braf_model, epochs=30, batch_size=15, verbose=0)
    braf_param_grid = dict(
        braf_learning_rate=list(np.linspace(0, 0.0001, num=5)),
        braf_lasso_rate=list(np.linspace(0, 0.0001, num=5)))
    braf_grid = GridSearchCV(estimator=braf_estimator, cv=2,
                             param_grid=braf_param_grid, n_jobs=30, pre_dispatch=5)
    braf_grid_result = braf_grid.fit(X_train_all_genes.values, Y_train_all_genes['BRAF_mutant'].values,callbacks=[BRAF_history])
    print('Done with BRAF')
    plot_loss(BRAF_history.history['loss'], title='BRAF LOSS')
    plot_accuracy(BRAF_history.history['acc'], title='BRAF Accuracy')

    BRAF_pred=list(map(lambda x:int(x),braf_grid.predict(X_test_all_genes.values)))

    return BRAF_pred
def kras():
    mut_pred=mutation_prediction(X_train_all_genes, Y_train_all_genes, X_valid=X_test_all_genes, Y_valid=Y_test_all_genes)
    print('Starting KRAS...')
    KRAS_history= History()
    kras_estimator = KerasClassifier(build_fn=mut_pred.kras_model, epochs=30, batch_size=15, verbose=0)
    kras_param_grid = dict(
        kras_learning_rate=list(np.linspace(0, 0.0001, num=10)),
        kras_lasso_rate=list(np.linspace(0, 0.0001, num=10)))
    kras_grid = GridSearchCV(estimator=kras_estimator, cv=2, param_grid=kras_param_grid, n_jobs=30,pre_dispatch=5)
    kras_grid_result = kras_grid.fit(X_train_all_genes.values, Y_train_all_genes['KRAS_mutant'].values,callbacks=[KRAS_history])
    print('Done with KRAS')
    plot_loss(KRAS_history.history['loss'], title='KRAS LOSS')
    plot_accuracy(KRAS_history.history['acc'], title='KRAS Accuracy')

    KRAS_pred=list(map(lambda x:int(x),kras_grid.predict(X_test_all_genes.values)))
    return KRAS_pred
def tp53():
    mut_pred=mutation_prediction(X_train_all_genes, Y_train_all_genes, X_valid=X_test_all_genes, Y_valid=Y_test_all_genes)
    print('Starting TP53...')
    TP53_history= History()
    tp53_estimator = KerasClassifier(build_fn=mut_pred.tp53_model, epochs=30, batch_size=15, verbose=0)
    tp53_param_grid = dict(
        tp53_learning_rate=list(np.linspace(0, 0.001, num=10)),
        tp53_lasso_rate=list(np.linspace(0, 0.0001, num=10)))
    # n_jobs (not jobs) is the GridSearchCV keyword, as in braf() and kras()
    tp53_grid = GridSearchCV(estimator=tp53_estimator, cv=2, param_grid=tp53_param_grid, n_jobs=30, pre_dispatch=5)
    tp53_grid_result = tp53_grid.fit(X_train_all_genes.values, Y_train_all_genes['TP53_mutant'].values,callbacks=[TP53_history])
    print('Done with TP53')
    plot_loss(TP53_history.history['loss'], title='TP53 LOSS')
    plot_accuracy(TP53_history.history['acc'], title='TP53 Accuracy')

    TP53_pred=list(map(lambda x:int(x),tp53_grid.predict(X_test_all_genes.values)))

    return TP53_pred

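In my main(), I call the functions above to run logistic regression for the three genes and return the predictions made with the best combination of learning rate and lasso rate.

Any feedback would be helpful.

UPDATE: When I interrupt the process, I get the following: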

Process ForkPoolWorker-58:
Process ForkPoolWorker-42:
Process ForkPoolWorker-56:
Process ForkPoolWorker-54:
Process ForkPoolWorker-52:
Process ForkPoolWorker-46:
Process ForkPoolWorker-40:
Process ForkPoolWorker-44:
Process ForkPoolWorker-38:
Process ForkPoolWorker-36:
Process ForkPoolWorker-60:
Process ForkPoolWorker-59:
Process ForkPoolWorker-43:
Process ForkPoolWorker-37:
Process ForkPoolWorker-39:
Process ForkPoolWorker-41:
Process ForkPoolWorker-45:
Process ForkPoolWorker-48:
Process ForkPoolWorker-53:
Process ForkPoolWorker-47:
Process ForkPoolWorker-57:
Process ForkPoolWorker-55:
Process ForkPoolWorker-49:
Process ForkPoolWorker-51:
Traceback (most recent call last):
  File "FinalProjectV1.py", line 354, in <module>
    main()
  File "FinalProjectV1.py", line 332, in main
    KRAS_pred_test=kras()
  File "FinalProjectV1.py", line 309, in kras
    kras_grid_result = kras_grid.fit(X_train_all_genes.values, Y_train_all_genes['KRAS_mutant'].values,callbacks=[KRAS_history])
  File "/soe/ianastop/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 639, in fit
    cv.split(X, y, groups)))
  File "/soe/ianastop/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 789, in __call__
    self.retrieve()
  File "/soe/ianastop/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/soe/ianastop/venv/lib/python3.6/multiprocessing/pool.py", line 638, in get
    self.wait(timeout)
  File "/soe/ianastop/venv/lib/python3.6/multiprocessing/pool.py", line 635, in wait
    self._event.wait(timeout)
  File "/soe/ianastop/venv/lib/python3.6/threading.py", line 551, in wait
    signaled = self._cond.wait(timeout)
  File "/soe/ianastop/venv/lib/python3.6/threading.py", line 295, in wait
    waiter.acquire()
KeyboardInterrupt

It looks like this has something to do with the multiprocessing library?


1 Answer

Stack Overflow user

Accepted answer

Answered 2018-06-11 18:52:04

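I was able to reproduce the error on my MacBook Pro.

The problem here is the TensorFlow session. If a session is created in the parent process before GridSearchCV.fit() is called, the search will definitely hang.

One possible solution is to confine all session-creation code to the KerasClassifier class and the model-creation functions.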
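As a minimal sketch of that idea, assuming the Keras 2.x / TensorFlow 1.x APIs of the time: the model-creation function below touches Keras only inside its own body, so nothing TensorFlow-related is set up in the parent process before GridSearchCV starts its workers. The name build_logreg_model and its parameters n_features, learning_rate and lasso_rate are hypothetical and do not come from the question's mutation_prediction class. Deferring the imports is an extra precaution; the key point is that no session exists in the parent.

```python
def build_logreg_model(n_features=100, learning_rate=1e-4, lasso_rate=1e-4):
    """Hypothetical build_fn: logistic regression as a one-unit Keras model.

    All Keras/TensorFlow work happens inside this function, so it runs
    in whichever process calls it (the GridSearchCV worker), not in the
    parent script at import time.
    """
    # Deferred imports (assumption: Keras 2.x with the TensorFlow 1.x backend).
    from keras.layers import Dense
    from keras.models import Sequential
    from keras.optimizers import Adam
    from keras.regularizers import l1

    model = Sequential()
    # A single sigmoid unit with an L1 (lasso) penalty on the weights.
    model.add(Dense(1, activation="sigmoid", input_dim=n_features,
                    kernel_regularizer=l1(lasso_rate)))
    model.compile(loss="binary_crossentropy",
                  optimizer=Adam(lr=learning_rate),
                  metrics=["accuracy"])
    return model
```

Such a function would be passed as build_fn to KerasClassifier, with param_grid keys named after its arguments (here learning_rate and lasso_rate), because the wrapper forwards grid parameters that match the build function's argument names.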

In addition, you may want to limit TF's memory usage inside the model-creation function or in a subclass of KerasClassifier.
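One way this might look, again assuming Keras 2.x with the TensorFlow 1.x backend (tf.ConfigProto and tf.Session no longer exist under those names in TensorFlow 2.x): a helper that caps the GPU memory each process may claim. The helper name limit_tf_memory and the default fraction are illustrative assumptions, and it is meant to be called at the top of the model-creation function so that the session is created inside the worker rather than in the parent.

```python
def limit_tf_memory(fraction=0.2):
    """Hypothetical helper: install a Keras session with a GPU memory cap.

    Intended to be called inside the model-creation function, i.e. inside
    the worker process, never in the parent before GridSearchCV.fit().
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")

    # Deferred imports (assumption: TensorFlow 1.x and Keras 2.x).
    import tensorflow as tf
    from keras.backend.tensorflow_backend import set_session

    config = tf.ConfigProto()
    # Upper bound on the share of GPU memory this process may use.
    config.gpu_options.per_process_gpu_memory_fraction = fraction
    # Allocate memory on demand instead of reserving it all up front.
    config.gpu_options.allow_growth = True
    set_session(tf.Session(config=config))
```

With 30 parallel workers, as in the question, a small per-process fraction keeps the workers from each trying to reserve the whole device.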

Quick fix:

n_jobs = 1

But this takes a long time to finish, because every fit then runs serially.
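To get a rough sense of the cost, the sketch below counts the model fits that the three grids in the question imply: GridSearchCV fits every parameter combination once per fold, plus one final refit on the full training set when refit is left at its default. The helper names linspace and count_fits are illustrative only.

```python
def linspace(start, stop, num):
    """Pure-Python stand-in for np.linspace, to keep the sketch dependency-free."""
    return [start + (stop - start) * i / (num - 1) for i in range(num)]

def count_fits(param_grid, cv, refit=True):
    """Fits GridSearchCV performs: candidates x folds, plus one refit."""
    n_candidates = 1
    for values in param_grid.values():
        n_candidates *= len(values)
    return n_candidates * cv + (1 if refit else 0)

# Grid sizes taken from the question's script (cv=2 in all three searches).
grids = {
    "BRAF": {"braf_learning_rate": linspace(0, 0.0001, 5),
             "braf_lasso_rate": linspace(0, 0.0001, 5)},
    "KRAS": {"kras_learning_rate": linspace(0, 0.0001, 10),
             "kras_lasso_rate": linspace(0, 0.0001, 10)},
    "TP53": {"tp53_learning_rate": linspace(0, 0.001, 10),
             "tp53_lasso_rate": linspace(0, 0.0001, 10)},
}

fits = {gene: count_fits(grid, cv=2) for gene, grid in grids.items()}
total_fits = sum(fits.values())
print(fits, total_fits)
```

Each of these fits trains for 30 epochs in the question's setup, so with n_jobs = 1 they all run one after another.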

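References:

Session hang issue with Python multiprocessing

Keras + TensorFlow and multiprocessing in Python

Limit the resource usage of the TensorFlow backend

GridSearchCV hangs on second run

jobs != 1 freezing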

jobs >1 Ask

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/50803502
