cuML RandomForestClassifier: CUDA error with the documentation example

Stack Overflow user
Asked 2021-06-03 16:47:17
Answers 1 · Views 379 · Followers 0 · Votes 0

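I am trying to run this example in a Jupyter notebook. It comes from here and is reproduced below, taken from the cuML introduction to classification. It runs fine with n_samples below 6000 (this parameter sets the number of rows in the generated dataset).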

import cuml
from cuml.datasets.classification import make_classification
from cuml.preprocessing.model_selection import train_test_split
from cuml.ensemble import RandomForestClassifier as cuRF
from sklearn.metrics import accuracy_score
from cupy import asnumpy

# synthetic dataset dimensions
n_samples = 1000
n_features = 10
n_classes = 2

# random forest depth and size
n_estimators = 25
max_depth = 10

# generate synthetic data [ binary classification task ]
X, y = make_classification ( n_classes = n_classes,
                             n_features = n_features,
                             n_samples = n_samples,
                             random_state = 0 )

X_train, X_test, y_train, y_test = train_test_split( X, y, random_state = 0 )

model = cuRF( max_depth = max_depth,
              n_estimators = n_estimators,
              random_state  = 0 )

%time trained_RF = model.fit ( X_train, y_train )

predictions = model.predict ( X_test )

cu_score = cuml.metrics.accuracy_score( y_test, predictions )
sk_score = accuracy_score( asnumpy( y_test ), asnumpy( predictions ) )

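With n_samples above 6000, I get the CUDA error below and the kernel crashes. Notes:

  • Increasing n_features from 10 to 5000 with n_samples = 5000 runs perfectly well, so the problem seems to be the number of rows in the dataset, not the number of columns.
  • Tested on the 2 GPUs available on the machine (GTX 1050 2GB)
  • nvidia-smi shows GPU memory usage below 25% during the run.
  • CUDA v11.2
  • Driver version: 460.73.01
  • Ubuntu 18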
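
One way to pin down the row-count threshold described in the notes above is to bisect on n_samples, running each attempt in a child process so that a CUDA failure cannot take the notebook kernel down with it. The following is only a sketch under assumptions: the names REPRO, fit_succeeds and first_failing are hypothetical, the repro fits on the full generated dataset rather than on a train split, and it assumes the failure depends on n_samples alone with a single threshold.

```python
import subprocess
import sys

# Hypothetical repro script: the same cuML calls as in the question,
# with n_samples taken from the command line (sys.argv[1] under `python -c`).
REPRO = """
import sys
from cuml.datasets.classification import make_classification
from cuml.ensemble import RandomForestClassifier as cuRF
X, y = make_classification(n_classes=2, n_features=10,
                           n_samples=int(sys.argv[1]), random_state=0)
cuRF(max_depth=10, n_estimators=25, random_state=0).fit(X, y)
"""


def fit_succeeds(n_samples):
    """Run the repro in a child process; True if it exits cleanly."""
    result = subprocess.run(
        [sys.executable, "-c", REPRO, str(n_samples)],
        capture_output=True,
    )
    return result.returncode == 0


def first_failing(lo, hi, succeeds):
    """Return the smallest n in (lo, hi] for which succeeds(n) is False.

    Assumes succeeds(lo) is True, succeeds(hi) is False, and that there
    is a single threshold between them.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if succeeds(mid):
            lo = mid  # still works at mid: threshold is above it
        else:
            hi = mid  # fails at mid: threshold is at or below it
    return hi


# Example call on a machine with cuML installed (not executed here):
# threshold = first_failing(5000, 7000, fit_succeeds)
```

The bisection takes the success check as an argument, so it can be tried with a stand-in predicate before any GPU run.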

Any help is much appreciated.

CUDA error:

RuntimeError                              Traceback (most recent call last)
~/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/internals/api_decorators.py in inner_with_setters(*args, **kwargs)
    408                 target_val=target_val)
    409
--> 410             return func(*args, **kwargs)
    411
    412         @

cuml/ensemble/randomforestclassifier.pyx in cuml.ensemble.randomforestclassifier.RandomForestClassifier.fit()

RuntimeError: file=/opt/conda/envs/rapids/conda-bld/libcuml_1614210250760/work/cpp/src/decisiontree/quantile/quantile.cuh line=150: call='cub::DeviceRadixSort::SortKeys( (void *)d_temp_storage->data(), temp_storage_bytes, &d_keys_in[batch_offset], d_keys_out->data(), n_sampled_rows, 0, 8 * sizeof(T), tempmem->stream)',
Obtained 64 stack frames
#0 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN4raft9exception18collect_call_stackEv+0x46) 0x7fa9b83eef36
#1 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN4raft10cuda_errorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x69) 0x7fa9b83ef699
#2 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML12DecisionTree19preprocess_quantileIfiEEvPKT_PKjiiiiSt10shared_ptrI15TemporaryMemoryIS2_T0_EE+0xaaf) 0x7fa9b84Fe7f
#3 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML12rfClassifierIfE3fitERKN4raft8handle_tEPKfiiPiiRPNS_20RandomForestMetaDataIfiEE+0xde3) 0x7fa9b8734b63
#4 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/common/../../../../libcuml++.so(_ZN2ML3fitERKN4raft8handle_tERPNS_20RandomForestMetaDataIfiEEPfiiPiiNS_9RF_paramsEi+0x1fd) 0x7fa9b872f54d
#5 in /home/oleg/anaconda3/envs/rapids/lib/python3.8/site-packages/cuml/ensemble/randomforestclassifier.cpython-38-x86_64-linux-gnu.so(+0x3c7e5) 0x7fa98e6d97e5
#6 in /home/oleg/anaconda3/envs/rapids/bin/python(PyObject_Call+0x255) 0x5589964052b5
#7 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x21c1) 0x5589964b1de1
#8 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) 0x558996490503
#9 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x1b2007) 0x558996492007
#10 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x4ca3) 0x5589964b48c3
#11 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) 0x558996490503
#12 in /home/oleg/anaconda3/envs/rapids/bin/python(PyEval_EvalCodeEx+0x39) 0x558996491559
#13 in /home/oleg/anaconda3/envs/rapids/bin/python(PyEval_EvalCode+0x1b) 0x5589965349ab
#14 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x2731de) 0x5589965531de
#15 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x128d4b) 0x558996408d4b
#16 (removed) /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) 0x558996490503
#55 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x1b2007) 0x558996492007
#56 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x1782) 0x5589964b13a2
#57 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x1925da) 0x5589964725da
#58 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x128d4b) 0x558996408d4b
#59 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x13b3ea) 0x55899641b3ea
#60 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x21da4f) 0x5589964fda4f
#61 in /home/oleg/anaconda3/envs/rapids/bin/python(+0x128fc2) 0x558996408fc2
#62 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalFrameDefault+0x92f) 0x5589964b054f
#63 in /home/oleg/anaconda3/envs/rapids/bin/python(_PyEval_EvalCodeWithName+0x2c3) 0x558996490503


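1 Answer

Stack Overflow user

Answered 2021-06-10 17:37:53

It turned out that this issue is related to the experimental backend used for RF in cuML. Setting split_algo = 0 in the cuRF configuration solves the problem by falling back to the default backend. At the time of writing, this is about 3x slower than using the experimental backend.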
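
As a rough illustration of the workaround, the sketch below passes split_algo=0 alongside the parameters from the question. The helper names rf_params and build_rf are hypothetical, and it assumes a cuML release from around the time of this thread that still accepts the split_algo keyword. Other releases may not, so check the RandomForestClassifier documentation for the installed version.

```python
def rf_params(n_estimators=25, max_depth=10, split_algo=0):
    """Keyword arguments for cuRF with the workaround applied.

    split_algo=0 is the setting reported in the answer; whether the
    keyword exists depends on the installed cuML version (assumption).
    """
    return {
        "n_estimators": n_estimators,
        "max_depth": max_depth,
        "random_state": 0,
        "split_algo": split_algo,
    }


def build_rf(**overrides):
    """Construct the classifier; cuML is imported lazily so that the
    parameter helper above stays usable on machines without a GPU."""
    from cuml.ensemble import RandomForestClassifier as cuRF
    return cuRF(**{**rf_params(), **overrides})


# Example use with the variables from the question (not executed here):
# model = build_rf()
# trained_RF = model.fit(X_train, y_train)
```

Keeping the parameters in one place makes it easy to compare timings with and without the workaround, given the roughly 3x slowdown reported above.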

Votes 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/67825532
