
TensorFlow GPU CUDA: Could not load dynamic library 'libcufft.so.10'

Stack Overflow user
Asked on 2021-09-23 13:18:30
Answers: 1 · Views: 846 · Followers: 0 · Votes: 0

I'm afraid this will be flagged as a duplicate, but the examples I found cover libcudart and libcublas, not libcufft (which is my problem).

I installed TensorFlow and want to use the GPU, so I ran the script from the linked installation guide.

When I run TensorFlow to train a network, I get the following messages:

2021-09-23 11:19:22.158959: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-23 11:19:22.162563: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2021-09-23 11:19:22.162651: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2021-09-23 11:19:22.162730: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2021-09-23 11:19:22.162806: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-09-23 11:19:22.162989: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-09-23 11:19:22.163345: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
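The warnings above come from the dynamic loader failing to open four CUDA libraries. A minimal sketch for reproducing that check outside TensorFlow, using only Python's standard ctypes module, could look like the following. The helper names can_dlopen and check_libraries are illustrative, not part of TensorFlow; the sonames are copied from the log.

```python
import ctypes

# Sonames copied from the warnings in the log above.
MISSING_IN_LOG = [
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
]


def can_dlopen(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname`."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the shared object cannot be found or loaded.
        return False


def check_libraries(sonames):
    """Map each soname to whether it could be loaded (hypothetical helper)."""
    return {name: can_dlopen(name) for name in sonames}


if __name__ == "__main__":
    for name, ok in check_libraries(MISSING_IN_LOG).items():
        print(f"{name}: {'found' if ok else 'NOT found'}")
```

If every library reports as found here but TensorFlow still warns, the two processes are probably running with different environments (for example a different LD_LIBRARY_PATH).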

Calling tf.config.list_physical_devices() gives:

2021-09-23 11:30:18.327648: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-09-23 11:30:18.329447: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2021-09-23 11:30:18.329510: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2021-09-23 11:30:18.329573: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2021-09-23 11:30:18.329687: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2021-09-23 11:30:18.329814: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1835] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
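This second log also prints the LD_LIBRARY_PATH the loader was given. A rough sketch for checking whether a library file actually sits in any of those directories is shown below. The function names are illustrative, and the two extra lib64 directories are assumptions about where a toolkit might live, not paths confirmed by the question.

```python
import os
from pathlib import Path


def split_search_path(value: str):
    """Split an LD_LIBRARY_PATH-style string, dropping empty entries
    such as the leading one visible in the log above."""
    return [entry for entry in value.split(":") if entry]


def find_library(soname: str, directories):
    """Return every existing path `directory/soname` among `directories`."""
    hits = []
    for directory in directories:
        candidate = Path(directory) / soname
        if candidate.exists():
            hits.append(str(candidate))
    return hits


if __name__ == "__main__":
    dirs = split_search_path(os.environ.get("LD_LIBRARY_PATH", ""))
    # Assumption: common toolkit library directories; adjust to your system.
    dirs += ["/usr/local/cuda/lib64", "/usr/local/cuda-11.0/lib64"]
    for d in dirs:
        print(f"{d}: {'exists' if os.path.isdir(d) else 'missing'}")
    print(find_library("libcufft.so.10", dirs))
```

An empty result for libcufft.so.10 would be consistent with the "cannot open shared object file" warning.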

I have a folder named /usr/local/cuda-11.0, but no plain /usr/local/cuda folder, and there is no extras folder either. Admittedly, the guide targets Ubuntu 18.04 and I am on Ubuntu 20.04.

If I try to run sudo apt install nvidia-cuda-toolkit, as suggested in another answer, I get:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 nvidia-cuda-toolkit : Depends: nvidia-cuda-dev (= 10.1.243-3) but it is not going to be installed
                       Recommends: nsight-compute (= 10.1.243-3)
                       Recommends: nsight-systems (= 10.1.243-3)
E: Unable to correct problems, you have held broken packages.

The output of whereis cuda is cuda: (empty).

Output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   40C    P8    31W / 300W |    626MiB / 11016MiB |     15%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1141      G   /usr/lib/xorg/Xorg                 59MiB |
|    0   N/A  N/A      1749      G   /usr/lib/xorg/Xorg                315MiB |
|    0   N/A  N/A      1886      G   /usr/bin/gnome-shell               59MiB |
|    0   N/A  N/A      1907      G   ...mviewer/tv_bin/TeamViewer        2MiB |
|    0   N/A  N/A      2463      G   ...ble-features=SpareRendere        4MiB |
|    0   N/A  N/A      3825      G   ...AAAAAAAAA= --shared-files      105MiB |
|    0   N/A  N/A      4682      G   .../debug.log --shared-files       36MiB |
|    0   N/A  N/A     20600      G   ...AAAAAAAAA= --shared-files       24MiB |
+-----------------------------------------------------------------------------+
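When reading this table, note that the "CUDA Version" field in the nvidia-smi header reports the highest CUDA version the driver supports, not which toolkit libraries are installed. That is why it can show 11.2 while libcufft.so.10 is still missing. A small sketch that pulls both version fields out of the header follows; parse_smi_header is an illustrative name, and the sample line is copied from the output above.

```python
import re

# Matches the header line of nvidia-smi output as shown above.
HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(text: str):
    """Return (driver_version, cuda_version) from nvidia-smi text, or None."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


if __name__ == "__main__":
    sample = (
        "| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    "
        "CUDA Version: 11.2     |"
    )
    print(parse_smi_header(sample))
```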

I'm afraid of installing something to fix this and ending up with the typical 20 versions of CUDA colliding with each other.


1 Answer

Stack Overflow user

Accepted answer

Posted on 2021-09-23 14:39:17

So, following the suggestion in the comments, I uninstalled everything in a very aggressive way:

sudo apt clean
sudo apt update
sudo apt purge cuda
sudo apt purge 'nvidia-*'
sudo apt autoremove

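Then I followed the instructions to install:

  • CUDA
  • CUDA toolkit (I think this is the same thing; the only command I added was sudo apt-get install nvidia-gds, and I don't even know whether it is necessary)
  • cuDNN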

Now it seems to work.
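One way to confirm this after a reinstall, sketched here as an illustration rather than something the answer itself ran, is to ask TensorFlow for its GPU devices with the same tf.config.list_physical_devices call used in the question. The wrapper visible_gpus is a hypothetical helper that returns an empty list when TensorFlow cannot be imported.

```python
def visible_gpus():
    """Return names of GPUs TensorFlow can see, or [] if TensorFlow is absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(gpus if gpus else "No GPU registered")
```

A non-empty list means the GPU was registered; an empty one means TensorFlow is missing or still falling back to the CPU, as in the question's output.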

Votes: 1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/69300826
