我有以下一些琐碎的要点::from程序(直接摘自thrust::from文档)
#include <thrust/gather.h>
#include <thrust/device_vector.h>
int main(void)
{
// mark even indices with a 1; odd indices with a 0
int values[10] = {1, 0, 1, 0, 1, 0, 1, 0, 1, 0};
thrust::device_vector<int> d_values(values, values + 10);
// gather all even indices into the first half of the range
// and odd indices to the last half of the range
int map[10] = {0, 2, 4, 6, 8, 1, 3, 5, 7, 9};
thrust::device_vector<int> d_map(map, map + 10);
thrust::device_vector<int> d_output(10);
thrust::gather(d_map.begin(), d_map.end(),
d_values.begin(),
d_output.begin());
// d_output is now {1, 1, 1, 1, 1, 0, 0, 0, 0, 0}
return 0;
}我用
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../common/inc -m64 -g -G -gencode arch=compute_30,code=sm_30 -o thrustGather.o -c thrustGather.cu
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -g -G -o thrustGather thrustGather.o接下来,我尝试在第一次将它附加到cuda之后运行这个简单的程序:
>cuda-gdb ./thrustGather
NVIDIA (R) CUDA Debugger
5.5 release
Portions Copyright (C) 2007-2013 NVIDIA Corporation
GNU gdb (GDB) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/cuda-5.5/samples/0_Simple/thrustGatherRjm/thrustGather...done.
(cuda-gdb) run
Starting program: /usr/local/cuda-5.5/samples/0_Simple/thrustGatherRjm/thrustGather
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7272700 (LWP 50318)]
[Context Create of context 0x78d790 on Device 0]
[Launch of CUDA Kernel 0 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Launch of CUDA Kernel 1 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Launch of CUDA Kernel 2 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Launch of CUDA Kernel 3 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Launch of CUDA Kernel 4 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Launch of CUDA Kernel 5 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Launch of CUDA Kernel 6 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Launch of CUDA Kernel 7 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
Error: received unexpected signal: Segmentation fault
BACKTRACE (41 frames):
cuda-gdb[0x4394e1]
/lib64/libc.so.6[0x3d96635a90]
cuda-gdb[0x5b038b]
cuda-gdb[0x55aae8]
cuda-gdb[0x55ed65]
cuda-gdb[0x55fc51]
cuda-gdb[0x55ec22]
cuda-gdb[0x5609fe]
cuda-gdb[0x5607bd]
cuda-gdb[0x560c36]
cuda-gdb[0x4f7e44]
cuda-gdb[0x4f8038]
cuda-gdb[0x4fde3c]
cuda-gdb[0x5c9f66]
cuda-gdb[0x429c3c]
cuda-gdb[0x5ca4e5]
cuda-gdb[0x5cab5e]
cuda-gdb[0x4296e6]
cuda-gdb[0x479366]
cuda-gdb[0x53addd]
cuda-gdb[0x5129c0]
cuda-gdb[0x5134fd]
cuda-gdb[0x51369d]
cuda-gdb[0x5091e7]
cuda-gdb[0x40f65d]
cuda-gdb[0x522f54]
cuda-gdb[0x523a20]
cuda-gdb[0x5ff9aa]
cuda-gdb[0x522fb9]
cuda-gdb[0x521b81]
cuda-gdb[0x522b1e]
cuda-gdb[0x51d0cb]
cuda-gdb[0x4ae816]
cuda-gdb[0x406429]
cuda-gdb[0x51d0cb]
cuda-gdb[0x406b76]
cuda-gdb[0x51d0cb]
cuda-gdb[0x406204]
cuda-gdb[0x4061d6]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3d96621b75]
cuda-gdb[0x4060e9]
[Termination of CUDA Kernel 7 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Termination of CUDA Kernel 6 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Termination of CUDA Kernel 5 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Termination of CUDA Kernel 4 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Termination of CUDA Kernel 3 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Termination of CUDA Kernel 2 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Termination of CUDA Kernel 1 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]
[Termination of CUDA Kernel 0 (memset32_aligned1D<<<(1,1,1),(128,1,1)>>>) on Device 0]请注意,cuda本身是分段错误。我还做了相应的扩展
其中,只有最后三个(inclusive_scan、sort_by_key、reduce_by_key)工作(即,不要使库达-gdb崩溃)。
这肯定是推力和/或cuda的最新版本(5.5)的问题,因为我在5.0版中运行了相同的测试,没有任何问题。
以下是有关我的设置的一些信息:
> cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 319.21 Sat May 11 23:51:00 PDT 2013
GCC version: gcc version 4.8.1 20130603 (Red Hat 4.8.1-1) (GCC)
> cat /proc/version
Linux version 3.9.9-302.fc19.x86_64 (mockbuild@bkernel01.phx2.fedoraproject.org) (gcc version 4.8.1 20130603 (Red Hat 4.8.1-1) (GCC) ) #1 SMP Sat Jul 6 13:41:07 UTC 2013
> gcc --version
gcc (GCC) 4.8.1 20130603 (Red Hat 4.8.1-1)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> lspci | grep NVIDIA
05:00.0 3D controller: NVIDIA Corporation GK104 [GeForce GTX 690] (rev a1)
05:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)
06:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 690] (rev a1)
06:00.1 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1)发布于 2013-07-18 14:22:21
正如talonmies所指出的,问题是在使用调试构建时,推力库不能正确运行。在我的应用程序中,我有一个相当复杂的.cu文件,其中包含我自己的几个CUDA内核,以及多个推力调用。如果我用-g -G调试标志编译这个文件并在cuda中运行,它就会崩溃--使我无法调试内核。
因为我不关心调试推力调用本身(只有我的内核),所以我的解决方案包括将所有的推力调用放到另一个文件thrustWrappers.cu中,并且不需要调试就编译这个文件。然后,在我的主.cu文件中,我将用关联的包装器函数(在thrustWrappers中定义)替换对推力的调用。例如,
thrust::reduce(...)变成了
thrust::reduce_wrapper(...)然后,我将两个产生的对象文件链接在一起。
https://stackoverflow.com/questions/17703990
复制相似问题