
PyCUDA thinks 53*53 == 2808

Stack Overflow user
Asked 2019-10-31 17:50:36
Answers: 1 · Views: 85 · Followers: 0 · Votes: 0

I tried to compute 53 * 53 with PyCUDA, as follows:

import numpy as np
import pycuda.gpuarray as gpuarray
import pycuda.autoinit

a = gpuarray.to_gpu(np.array([53]))
print((a**2).get()[0])

It prints 2808, whereas the correct answer is 2809. Where did I go wrong?


1 Answer

Stack Overflow user

Answered 2019-11-19 11:14:34

"It prints 2808, whereas the correct answer is 2809"

No, it doesn't:

$ cat ohnoitdoesnt.py 
import numpy as np
import pycuda.gpuarray as gpuarray
import pycuda.autoinit

a = gpuarray.to_gpu(np.array([53]))
print((a**2).get()[0])

$ python ohnoitdoesnt.py 
Traceback (most recent call last):
  File "ohnoitdoesnt.py", line 6, in <module>
    print((a**2).get()[0])
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/gpuarray.py", line 659, in __pow__
    return self._pow(other,new=True)
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/gpuarray.py", line 643, in _pow
    func = elementwise.get_pow_kernel(self.dtype)
  File "<string>", line 2, in get_pow_kernel
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/tools.py", line 430, in context_dependent_memoize
    result = func(*args)
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/elementwise.py", line 559, in get_pow_kernel
    "pow_method")
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/elementwise.py", line 161, in get_elwise_kernel
    arguments, operation, name, keep, options, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/elementwise.py", line 147, in get_elwise_kernel_and_types
    keep, options, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/elementwise.py", line 75, in get_elwise_module
    options=options, keep=keep)
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/compiler.py", line 291, in __init__
    arch, code, cache_dir, include_dirs)
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/compiler.py", line 255, in compile
    return compile_plain(source, options, keep, nvcc, cache_dir, target)
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/compiler.py", line 137, in compile_plain
    stderr=stderr.decode("utf-8", "replace"))
pycuda.driver.CompileError: nvcc compilation of /tmp/tmpaeIBGe/kernel.cu failed
[command: nvcc --cubin -arch sm_52 -I/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/cuda kernel.cu]
[stderr:
kernel.cu(19): error: calling a __host__ function("std::pow<long, long> ") from a __global__ function("pow_method") is not allowed

kernel.cu(19): error: identifier "std::pow<long, long> " is undefined in device code

2 errors detected in the compilation of "/tmp/tmpxft_00001674_00000000-6_kernel.cpp1.ii".
]

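This is a not-unknown problem in CUDA and PyCUDA: the CUDA math library does not provide integer-argument overloads for most functions. Here, np.array([53]) has an integer dtype, so PyCUDA generates a kernel that calls pow on integer arguments, which resolves to the host-only std::pow<long, long> and fails to compile for the device.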
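To see why the integer path is taken, it can help to inspect the dtype NumPy assigns before the array is sent to the GPU. The snippet below is a minimal host-side sketch using NumPy only; the helper name is_integer_dtype is hypothetical and not part of PyCUDA. The exact integer width printed depends on the platform, so no output is shown.

```python
import numpy as np


def is_integer_dtype(dtype):
    # Hypothetical helper: True for any NumPy integer dtype (int32, int64, ...).
    return np.issubdtype(dtype, np.integer)


values = np.array([53])  # no dtype given, so NumPy picks an integer type
print(values.dtype, is_integer_dtype(values.dtype))

as_float = values.astype(np.float32)  # explicit cast to a floating-point type
print(as_float.dtype, is_integer_dtype(as_float.dtype))
```

An array for which this check returns True is the kind that would presumably hit the integer pow kernel shown in the traceback above.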

If we fix that and use a floating-point type, it works as expected:

$ cat ohnoitdoesnt.py 
import numpy as np
import pycuda.gpuarray as gpuarray
import pycuda.autoinit

a = gpuarray.to_gpu(np.array([53], dtype=np.float32))
print((a**2).get()[0])

$ python ohnoitdoesnt.py 
2809.0
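If an integer result is needed, two possible workarounds are squaring by multiplication (which avoids the pow kernel altogether) or computing in floating point and rounding back. The sketch below demonstrates both on the host with NumPy only; the helper names square_elementwise and pow_via_float are hypothetical. It assumes, without having tested it here, that the same arr * arr expression also works on a pycuda.gpuarray.GPUArray, since GPUArray defines elementwise multiplication.

```python
import numpy as np


def square_elementwise(arr):
    # Squares by multiplication, so no pow function is involved.
    # Assumption: this also applies to a PyCUDA GPUArray, which supports `*`.
    return arr * arr


def pow_via_float(host_values, exponent):
    # Hypothetical helper: exponentiate in float64, then round to the
    # nearest integer instead of truncating.
    as_float = np.asarray(host_values, dtype=np.float64)
    return np.rint(as_float ** exponent).astype(np.int64)


host = np.array([53])
print(square_elementwise(host)[0])  # 2809
print(pow_via_float(host, 2)[0])    # 2809
```

Rounding rather than truncating matters for the second approach: a floating-point result that lands slightly below the exact integer would otherwise be cut down by one.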
Votes: 1
The original content of this page is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58648841
