PyCUDA segmentation fault with NVIDIA's cuSolver library

Stack Overflow user
Asked 2015-04-21 15:11:26
Answers: 1 · Views: 576 · Followers: 0 · Votes: 0

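I'm trying to write a PyCUDA wrapper, inspired by the scikits-cuda library, for some of the operations provided by the new cuSolver library. As a first step I need to perform an LU factorization with cusolverDnSgetrf(). Before that I need the 'Workspace' argument, and the function cuSolver provides for sizing it is cusolverDnSgetrf_bufferSize(). When I call it, though, the program simply crashes with a segmentation fault. What am I doing wrong?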

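Note: I already have this operation working with scikits-cuda, but the cuSolver library uses this kind of argument a lot, and I want to compare the usage of scikits-cuda with my own implementation on top of the new library.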

import numpy as np
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.gpuarray as gpuarray
import ctypes
import ctypes.util

libcusolver = ctypes.cdll.LoadLibrary('libcusolver.so')

class _types:
  handle = ctypes.c_void_p

libcusolver.cusolverDnCreate.restype = int
libcusolver.cusolverDnCreate.argtypes = [_types.handle]

def cusolverCreate():
    handle = _types.handle()
    libcusolver.cusolverDnCreate(ctypes.byref(handle))
    return handle.value

libcusolver.cusolverDnDestroy.restype = int
libcusolver.cusolverDnDestroy.argtypes = [_types.handle]

def cusolverDestroy(handle):
    libcusolver.cusolverDnDestroy(handle)


libcusolver.cusolverDnSgetrf_bufferSize.restype = int
libcusolver.cusolverDnSgetrf_bufferSize.argtypes =[_types.handle,
                                       ctypes.c_int,
                                       ctypes.c_int,
                                       ctypes.c_void_p,
                                       ctypes.c_int,
                                       ctypes.c_void_p]

def cusolverLUFactorization(handle, matrix):
    m,n=matrix.shape
    mtx_gpu = gpuarray.to_gpu(matrix.astype('float32'))
    work=gpuarray.zeros(1, np.float32)
    status=libcusolver.cusolverDnSgetrf_bufferSize(
                          handle, m, n,
                          int(mtx_gpu.gpudata),
                          n, int(work.gpudata))
    print(status)


x = np.asarray(np.random.rand(3, 3), np.float32)
handle_solver=cusolverCreate()
cusolverLUFactorization(handle_solver,x)
cusolverDestroy(handle_solver)

1 Answer

Stack Overflow user

Accepted answer

Answered 2015-04-21 21:07:34

The last argument of cusolverDnSgetrf_bufferSize should be a regular (host) pointer, not a pointer to GPU memory. Try modifying the cusolverLUFactorization() function as follows:

def cusolverLUFactorization(handle, matrix):
    m, n = matrix.shape
    mtx_gpu = gpuarray.to_gpu(matrix.astype('float32'))

    # The size is written by the host, so it must live in host memory.
    work = ctypes.c_int()
    status = libcusolver.cusolverDnSgetrf_bufferSize(
                         handle, m, n,
                         int(mtx_gpu.gpudata),
                         n, ctypes.pointer(work))
    print(status)      # 0 means success
    print(work.value)  # required workspace size
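
The host-pointer pattern used above can be tried without a GPU. The sketch below is only an illustration: it uses the C math library's frexp(double, int *) as a stand-in, because it takes the same kind of int * output parameter, and it assumes a POSIX system where that library can be loaded. The helper names frexp and rejects_raw_address are made up for this example. It also shows that declaring the parameter as ctypes.POINTER(ctypes.c_int) instead of ctypes.c_void_p makes ctypes refuse a bare integer address, which would likely have turned the crash in the question into a Python exception.

```python
import ctypes
import ctypes.util

# Assumption: POSIX system; fall back to the symbols already loaded
# into the process if the math library cannot be located by name.
libm = ctypes.CDLL(ctypes.util.find_library("m") or None)

# double frexp(double x, int *exp): the exponent is written through
# a pointer to host memory, like the size in the call above.
libm.frexp.restype = ctypes.c_double
libm.frexp.argtypes = [ctypes.c_double, ctypes.POINTER(ctypes.c_int)]


def frexp(x):
    """Return (mantissa, exponent) such that x == mantissa * 2**exponent."""
    exponent = ctypes.c_int()  # host-side storage for the output value
    mantissa = libm.frexp(x, ctypes.byref(exponent))
    return mantissa, exponent.value


def rejects_raw_address(address):
    """Return True if ctypes refuses a bare integer where int* is declared."""
    try:
        libm.frexp(8.0, address)
    except ctypes.ArgumentError:
        return True
    return False


print(frexp(8.0))
print(rejects_raw_address(12345))
```

With ctypes.c_void_p in argtypes, as in the question, any integer is accepted as an address, so a device pointer is passed through unchecked and the library dereferences it on the host.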

Votes: 2
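
Both versions of the function only print the returned status. As a possible refinement, a wrapper could turn non-zero codes into exceptions. The sketch below is an assumption-laden example, not part of cuSolver or scikits-cuda: CusolverError and check_status are hypothetical names, and the numeric values are taken from the cusolverStatus_t enum as I understand it, so they should be checked against the headers of the installed CUDA toolkit.

```python
# Assumption: values of the cusolverStatus_t enum; verify against the
# cusolver headers shipped with the installed CUDA toolkit.
CUSOLVER_STATUS_NAMES = {
    0: "CUSOLVER_STATUS_SUCCESS",
    1: "CUSOLVER_STATUS_NOT_INITIALIZED",
    2: "CUSOLVER_STATUS_ALLOC_FAILED",
    3: "CUSOLVER_STATUS_INVALID_VALUE",
    4: "CUSOLVER_STATUS_ARCH_MISMATCH",
    5: "CUSOLVER_STATUS_MAPPING_ERROR",
    6: "CUSOLVER_STATUS_EXECUTION_FAILED",
    7: "CUSOLVER_STATUS_INTERNAL_ERROR",
}


class CusolverError(RuntimeError):
    """Hypothetical exception type for non-zero cuSolver status codes."""


def check_status(status):
    """Raise CusolverError unless status is 0 (success)."""
    if status != 0:
        name = CUSOLVER_STATUS_NAMES.get(status, "unknown status")
        raise CusolverError("cuSolver call failed: %s (%d)" % (name, status))


check_status(0)  # success: returns without raising
try:
    check_status(3)
except CusolverError as err:
    print(err)
```

In the wrapper, print(status) could then be replaced by check_status(status), so that a failed call stops the program instead of passing unnoticed.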
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/29776229
