首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >导入后运行时出现hdbscan并行错误

导入后运行时出现hdbscan并行错误
EN

Stack Overflow用户
提问于 2018-02-06 21:24:28
回答 1查看 353关注 0票数 1

我正在我的数据上构建和拟合一个hdbscan模型,当我从文件内部运行脚本时,它工作得又好又快,但是当我导入文件并从‘外部’运行它时,它进入了一个奇怪的循环,我不知道它是如何开始的。我得到了以下错误:

代码语言:javascript
复制
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information

以下是代码的摘录:

代码语言:javascript
复制
df_pos_raw, df_pos_training = pre_process_data(df_pos)
df_pos_training_std = standardize_df(df_pos_training)  # Standardized data, column-wise
print "generating model"
pos_cls = hdbscan.HDBSCAN(min_cluster_size=10, prediction_data=True)
print "fitting model to data"
pos_cls.fit(df_pos_training_std)
print 'done fitting model'
# sns.distplot(pos_cls.labels_, bins=len(set(pos_cls.labels_)))
df_filtered = filter_cons_types(df, [3, 5])
print "Done. returning variables"
return pos_cls, df_filtered

以下是从“外部”文件运行时的输出:

代码语言:javascript
复制
Traceback (most recent call last):
File "<string>", line 1, in <module>
generating model
File "C:\ProgramData\Anaconda2\Lib\multiprocessing\forking.py", line 380, in main
fitting model to data
prepare(preparation_data)
File "C:\ProgramData\Anaconda2\Lib\multiprocessing\forking.py", line 510, in prepare
'__parents_main__', file, path_name, etc
File "C:\Users\sareetn\PycharmProjects\Arad\DataImputation\ClusteringExtrapolation\Dev\run_clustering_based_prediction.py", line 4, in <module>
model, raw_df = clustering()
File "C:\Users\sareetn\PycharmProjects\Arad\DataImputation\ClusteringExtrapolation\Dev\clustering_model_constype_3_5.py", line 86, in main
pos_cls.fit(df_pos_training_std)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\hdbscan\hdbscan_.py", line 816, in fit
self._min_spanning_tree) = hdbscan(X, **kwargs)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\hdbscan\hdbscan_.py", line 543, in hdbscan
core_dist_n_jobs, **kwargs)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\memory.py", line 362, in __call__
return self.func(*args, **kwargs)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\hdbscan\hdbscan_.py", line 239, in _hdbscan_boruvka_kdtree
n_jobs=core_dist_n_jobs, **kwargs)
File "hdbscan/_hdbscan_boruvka.pyx", line 375, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm.__init__ (hdbscan/_hdbscan_boruvka.c:5195)
File "hdbscan/_hdbscan_boruvka.pyx", line 411, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm._compute_bounds (hdbscan/_hdbscan_boruvka.c:5915)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 749, in __call__
n_jobs = self._initialize_backend()
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 547, in _initialize_backend
**self._backend_args)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 305, in configure
'[joblib] Attempting to do parallel computing '
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
generating model
fitting model to data
generating model
fitting model to data
generating model
fitting model to data

非常感谢你提前!

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2018-02-08 02:59:56

我的一个朋友帮我弄明白了-

集群使用一个名为joblib的库,该库将作业拆分为多个并行进程。在Windows计算机上运行这些函数时,需要注意确保使用

代码语言:javascript
复制
if __name__ == '__main__'

为了保护代码并允许并行处理工作。添加后

代码语言:javascript
复制
if __name__ == '__main__' 

将所有代码放在那里,集群运行得又快又流畅

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/48644137

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档