首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >TensorFlow仅适用于GPU0

TensorFlow仅适用于GPU0
EN

Stack Overflow用户
提问于 2016-03-12 07:05:20
回答 1查看 3.4K关注 0票数 1

我很难尝试在GPU1中运行tensorflow程序。无论我是使用CUDA_VISIBLE_DEVICES=1 python program.py还是在程序中使用tf.device('/gpu:1'),我总是收到以下错误:

代码语言:javascript
复制
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:04:00.0
Total memory: 12.00GiB
Free memory: 11.90GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:717] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:04:00.0)
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 512.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 512.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.00GiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.00GiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.00GiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.00GiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.00GiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:73] Allocating 11.31GiB bytes.
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:83] GPU 0 memory begins at 0x9047a0000 extends to 0xbd8325334
F tensorflow/stream_executor/cuda/cuda_driver.cc:383] Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(context) (0 vs. 216)
F tensorflow/stream_executor/cuda/cuda_driver.cc:383] Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(context) (0 vs. 216)
F tensorflow/stream_executor/cuda/cuda_driver.cc:383] Check failed: CUDA_SUCCESS == dynload::cuCtxSetCurrent(context) (0 vs. 216)
Aborted

在使用GPU0运行时不会发生这种情况,但问题是其他人正在使用该GPU,而GPU1处于空闲状态

EN

回答 1

Stack Overflow用户

发布于 2016-03-12 22:55:37

如果您使用的是CUDA_VISIBLE_DEVICES=1,请确保“其他人”正在使用CUDA_VISIBLE_DEVICES=0运行。

默认情况下,如果未指定CUDA_VISIBLE_DEVICES,则tensorflow进程会占用所有GPU,即使它并未主动使用这些GPU。它们可能显示为空闲,但后续tensorflow进程将无法访问它们。但是,您可以在每个进程上使用CUDA_VISIBLE_DEVICES为每个进程分配不同的GPU。

请记住,在每个进程中,tensorflow设备/gpu:0、/gpu:1等都是与进程相关的,而不是全局的,它们将引用该process...put可用的任何cuda可见设备。如果您希望每个进程使用1个GPU,您可以在代码中引用所有这些GPU的/gpu:0,同时使用CUDA_VISIBLE_DEVICES为每个进程分配特定的GPU。

使用命令nvidia-smi查看哪些进程正在使用哪些GPU。

票数 3
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/35951487

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档