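I am trying to compute the FFT of an image using CUFFT. It seems that CUFFT only provides FFTs on plain device pointers allocated with cudaMalloc.
My input images are allocated with cudaMallocPitch, but there is no option for handling the pitch of the image pointer.
Currently, I have to remove the row alignment, execute the FFT, and then copy the results back to the pitched pointer. My current code is as follows: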
void fft_device(float* src, cufftComplex* dst, int width, int height, int srcPitch, int dstPitch)
{
    //src and dst are device pointers allocated with cudaMallocPitch
    //Convert them to plain pointers. No padding of rows.
    float *plainSrc;
    cufftComplex *plainDst;
    cudaMalloc<float>(&plainSrc,width * height * sizeof(float));
    cudaMalloc<cufftComplex>(&plainDst,width * height * sizeof(cufftComplex));
    cudaMemcpy2D(plainSrc,width * sizeof(float),src,srcPitch,width * sizeof(float),height,cudaMemcpyDeviceToDevice);
    cufftHandle handle;
    cufftPlan2d(&handle,width,height,CUFFT_R2C);
    cufftSetCompatibilityMode(handle,CUFFT_COMPATIBILITY_NATIVE);
    cufftExecR2C(handle,plainSrc,plainDst);
    cufftDestroy(handle);
    cudaMemcpy2D(dst,dstPitch,plainDst,width * sizeof(cufftComplex),width * sizeof(cufftComplex),height,cudaMemcpyDeviceToDevice);
    cudaFree(plainSrc);
    cudaFree(plainDst);
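}
It gives the correct result, but I do not want to do two extra memory allocations and copies inside the function. I would like to do something like this: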
void fft_device(float* src, cufftComplex* dst, int width, int height, int srcPitch, int dstPitch)
{
    //src and dst are device pointers allocated with cudaMallocPitch
    //Don't know how to handle pitch here???
    cufftHandle handle;
    cufftPlan2d(&handle,width,height,CUFFT_R2C);
    cufftSetCompatibilityMode(handle,CUFFT_COMPATIBILITY_NATIVE);
    cufftExecR2C(handle,src,dst);
    cufftDestroy(handle);
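}
Question:
How can I compute the FFT of a pitched pointer directly with CUFFT?
Posted on 2012-12-27 17:47:48
I think you may be interested in cufftPlanMany, which lets you do 1D, 2D, and 3D FFTs with pitches. The key here is the inembed and onembed parameters.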
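To make the role of inembed concrete, here is a small host-only C++ sketch of the advanced data layout indexing rule as described in the cuFFT documentation: for a 2D transform, element (x, y) of batch b is read from input[b * idist + (x * inembed[1] + y) * istride]. The helper names advancedLayoutIndex, pitchedByteOffset, and layoutsAgree are made up for illustration and are not part of cuFFT or the CUDA runtime. The sketch assumes a row-major image and a pitch that is a whole multiple of the element size.

```cpp
#include <cassert>
#include <cstddef>

// Element index used by the advanced data layout for a 2D transform:
// x walks the outer (slow) dimension n[0], y walks the inner dimension n[1].
// Hypothetical helper, not a cuFFT function.
std::size_t advancedLayoutIndex(std::size_t b, std::size_t x, std::size_t y,
                                std::size_t inembed1, std::size_t istride,
                                std::size_t idist)
{
    return b * idist + (x * inembed1 + y) * istride;
}

// Byte offset of element (row, col) in a buffer from cudaMallocPitch.
// Hypothetical helper, not a CUDA runtime function.
std::size_t pitchedByteOffset(std::size_t row, std::size_t col,
                              std::size_t pitchBytes, std::size_t elemSize)
{
    return row * pitchBytes + col * elemSize;
}

// Returns true when both addressing schemes agree for every element of a
// height x width image, using inembed[1] = pitchBytes / elemSize and istride = 1.
bool layoutsAgree(std::size_t width, std::size_t height,
                  std::size_t pitchBytes, std::size_t elemSize)
{
    assert(pitchBytes % elemSize == 0); // assumption: pitch is a whole number of elements
    const std::size_t inembed1 = pitchBytes / elemSize;
    for (std::size_t row = 0; row < height; ++row)
        for (std::size_t col = 0; col < width; ++col)
            if (advancedLayoutIndex(0, row, col, inembed1, 1, 0) * elemSize
                != pitchedByteOffset(row, col, pitchBytes, elemSize))
                return false;
    return true;
}
```

The takeaway is that the last entry of inembed (and likewise onembed) plays the part of the row pitch, expressed in elements rather than bytes, while the rows themselves are the outer dimension.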
You can look at CUDA_CUFFT_Users_Guide.pdf (pages 23-24) for more information. For your example, you would do something like the following.
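void fft_device(float* src, cufftComplex* dst,
                int width, int height,
                int srcPitch, int dstPitch)
{
    cufftHandle handle;
    int rank = 2;                  // 2D FFT
    int n[] = {height, width};     // Transform size, slowest-varying dimension (rows) first
    int istride = 1, ostride = 1;  // Elements within a row are contiguous
    // cudaMallocPitch reports pitches in bytes; the embed arrays are in elements
    int srcPitchElems = srcPitch / (int)sizeof(float);
    int dstPitchElems = dstPitch / (int)sizeof(cufftComplex);
    int inembed[] = {height, srcPitchElems}; // Input storage size, padded row length last
    int onembed[] = {height, dstPitchElems}; // Output storage size; R2C writes width/2 + 1 values per row
    int idist = height * srcPitchElems;      // Distance between batches (not used when batch == 1)
    int odist = height * dstPitchElems;
    int batch = 1;
    cufftPlanMany(&handle, rank, n,
                  inembed, istride, idist,
                  onembed, ostride, odist, CUFFT_R2C, batch);
    cufftSetCompatibilityMode(handle,CUFFT_COMPATIBILITY_NATIVE);
    cufftExecR2C(handle,src,dst);
    cufftDestroy(handle);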
}
P.S. I have not added return-value checks here, to keep the example short. Always check the return values in your code.
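In the same spirit as checking return values, it may help to validate the layout arguments before creating the plan. The following host-only C++ sketch derives the arrays for cufftPlanMany from byte pitches and rejects inputs that cannot work. PitchedR2CLayout and makePitchedR2CLayout are hypothetical names, not cuFFT API. The sketch assumes a row-major image (height rows of width floats), pitches in bytes as returned by cudaMallocPitch, a complex element of two floats (the size of cufftComplex), and a real-to-complex output of width / 2 + 1 complex values per row.

```cpp
#include <cassert>
#include <cstddef>

// Layout arguments for one 2D R2C cufftPlanMany call on pitched buffers.
// Hypothetical struct for illustration only.
struct PitchedR2CLayout {
    int n[2];        // transform size, slowest dimension (rows) first
    int inembed[2];  // input storage dimensions, in floats
    int onembed[2];  // output storage dimensions, in complex elements
    int idist;       // elements between consecutive input batches
    int odist;       // elements between consecutive output batches
    bool valid;      // false if the pitches cannot describe this transform
};

PitchedR2CLayout makePitchedR2CLayout(int width, int height,
                                      std::size_t srcPitchBytes,
                                      std::size_t dstPitchBytes)
{
    PitchedR2CLayout layout = {};  // zero-initialized, so valid starts out false
    if (width <= 0 || height <= 0) return layout;

    const std::size_t realSize = sizeof(float);
    const std::size_t complexSize = 2 * sizeof(float);  // assumed size of cufftComplex
    const std::size_t outWidth = static_cast<std::size_t>(width) / 2 + 1;

    // Pitches must be whole numbers of elements and wide enough for one row.
    if (srcPitchBytes % realSize != 0 || dstPitchBytes % complexSize != 0) return layout;
    if (srcPitchBytes < static_cast<std::size_t>(width) * realSize) return layout;
    if (dstPitchBytes < outWidth * complexSize) return layout;

    layout.n[0] = height;
    layout.n[1] = width;
    layout.inembed[0] = height;
    layout.inembed[1] = static_cast<int>(srcPitchBytes / realSize);
    layout.onembed[0] = height;
    layout.onembed[1] = static_cast<int>(dstPitchBytes / complexSize);
    layout.idist = height * layout.inembed[1];
    layout.odist = height * layout.onembed[1];
    assert(layout.inembed[1] >= width);  // follows from the row-width check above
    layout.valid = true;
    return layout;
}
```

If layout.valid is true, the fields could be passed as cufftPlanMany(&handle, 2, layout.n, layout.inembed, 1, layout.idist, layout.onembed, 1, layout.odist, CUFFT_R2C, 1), with the returned cufftResult compared against CUFFT_SUCCESS.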
https://stackoverflow.com/questions/14026900