首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Cuda矢量类型的推力支撑

Cuda矢量类型的推力支撑
EN

Stack Overflow用户
提问于 2018-07-10 22:27:14
回答 1查看 320关注 0票数 1

我目前正在尝试使用thrust::upper_bound函数。我遇到了一个问题,与我提供给函数的论点。我想使用CUDA向量类型,特别是double3,但是当我使用这种类型时,我得到了几个推力库错误。

我正在运行的代码块如下:

代码语言:javascript
复制
/********************************************************************************                                                                            
eos_search_gpu()                                                                                                                                           
purpose        --- kernel to find the upper bound index for the                                                                                            
                 interpolation values                                                                                                                    
arguments --                                                                                                                                               

y              --- input   double3 values for which we are searching                                                                                       
my             --- input   int number of values for which we are searching                                                                                 
x              --- input   double3 array of structs containin the data table                                                                               
                         values for x, y, and f corresponding to structs                                                                                 
                         ".x", ".y", and ".z"                                                                                                            
n              --- input   int number of data values in the table                                                                                          
dim_x          --- input   int number of data values in the x-direcion of table                                                                            
j[]            --- input/output    int[]  array of int'sthat contains                                                                                      
                 the index of the (x,y,f) position of the upper bound                                                                                    


library calls --                                                                                                                                           

  __host__ __device__ ForwardIterator  thrust::upper_bound(                                                                                                  
         const thrust::detail::execution_policy_base<DerivedPolicy>& exec,                                                                               
         ForwardIterator                                             first,                                                                              
         ForwardIterator                                             last,                                                                               
         const LessThanComparable &                                  value                                                                               
         )                                                                                                                                               

 exec         --- the execution policy to use for parallelization                                                                                        
 first        --- the beginning of the ordered sequence                                                                                                  
 last         --- the end of the ordered sequence                                                                                                        
 value        --- the value to be searched.                                                                                                              

 Returns:     the furthermost iterator i, such that value < *i is false                                                                                  


 const detail::seq_t thrust::seq                                                                                                                            
 an execution policy which requires analgorithm invocation to execute                                                                                    
 sequentially in the current thread.                                                                                                                     

 ********************************************************************************/

__global__ void eos_search_gpu(const double3* y, const int my,
                           const double3* x, const int n,
                           const int dim_x, int * j){

    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if ( i < my) {
      const double ptr = thrust::upper_bound(thrust::seq, x[0].y , x[n-1].y, y[i].y);                                                                     
      j[i] = (ptr - x[i].y - 1);

    }
}

所显示的错误消息如下:

代码语言:javascript
复制
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/iterator_traits.h(45): error: a class or namespace qualified name is required
      detected during:
        instantiation of class "thrust::iterator_traits<T> [with T=double]" 
 /opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/detail/iterator_traits.inl(53): here
        instantiation of class "thrust::iterator_difference<Iterator> [with Iterator=double]" 
  /opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/sequential/binary_search.h(102): here
        instantiation of "ForwardIterator     thrust::system::detail::sequential::upper_bound(thrust::system::detail::sequential::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]" 
 /opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(83): here
        instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/generic/binary_search.inl(225): here
        instantiation of "ForwardIterator thrust::system::detail::generic::upper_bound(thrust::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(69): here
        instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const LessThanComparable &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, LessThanComparable=double]" 
Interpolation_cuda.cu(254): here

/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/iterator_traits.h(45): error: global-scope qualifier (leading "::") is not allowed
      detected during:
        instantiation of class "thrust::iterator_traits<T> [with T=double]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/detail/iterator_traits.inl(53): here
        instantiation of class "thrust::iterator_difference<Iterator> [with Iterator=double]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/sequential/binary_search.h(102): here
        instantiation of "ForwardIterator thrust::system::detail::sequential::upper_bound(thrust::system::detail::sequential::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]" 
 /opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(83): here
        instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/generic/binary_search.inl(225): here
        instantiation of "ForwardIterator thrust::system::detail::generic::upper_bound(thrust::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(69): here
        instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const LessThanComparable &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, LessThanComparable=double]" 
Interpolation_cuda.cu(254): here

/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/iterator_traits.h(45): error: expected a ";"
      detected during:
        instantiation of class "thrust::iterator_traits<T> [with T=double]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/iterator/detail/iterator_traits.inl(53): here
        instantiation of class "thrust::iterator_difference<Iterator> [with Iterator=double]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/sequential/binary_search.h(102): here
        instantiation of "ForwardIterator thrust::system::detail::sequential::upper_bound(thrust::system::detail::sequential::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(83): here
        instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &, StrictWeakOrdering) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double, StrictWeakOrdering=thrust::system::detail::generic::detail::binary_search_less]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/system/detail/generic/binary_search.inl(225): here
        instantiation of "ForwardIterator thrust::system::detail::generic::upper_bound(thrust::execution_policy<DerivedPolicy> &, ForwardIterator, ForwardIterator, const T &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, T=double]" 
/opt/cudatoolkit/9.1/bin/../targets/x86_64-linux/include/thrust/detail/binary_search.inl(69): here
        instantiation of "ForwardIterator thrust::upper_bound(const thrust::detail::execution_policy_base<DerivedPolicy> &, ForwardIterator, ForwardIterator, const LessThanComparable &) [with DerivedPolicy=thrust::detail::seq_t, ForwardIterator=double, LessThanComparable=double]" 
Interpolation_cuda.cu(254): here

我想知道thrust是否支持使用CUDA向量类型,或者我只是做了一些错误的事情。

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2018-07-10 23:12:49

您需要满足推力算法的所有预期输入类型。你没有这么做,因为你定义的几乎每一个数量都不符合预期的推力。

首先,我们需要实际的迭代器。在设备代码中,这意味着指针。推力需要能够取消迭代器/指针的引用,然后你必须指示推力如何处理这个量。为此,我们需要一个适当定义的函子。您可能希望阅读推力快速启动导轨来理解函子的定义和用法。最后,这里的敏感指针/迭代器引用了一个double3类型,因此我们需要完成几乎所有的操作才能使用double3。注意,我们需要选择允许定义自定义函子的版本 of upper_bound,这样我们就可以正确地操作double3数量(当我们取消迭代器/指针的引用时得到的值)。

这可能有助于:

代码语言:javascript
复制
#include <thrust/binary_search.h>
#include <thrust/execution_policy.h>


struct my_comp_functor{
template <typename T>
__host__ __device__
  bool operator()(T &t1, T &t2) {
    return (t1.y < t2.y);}
};

__global__ void eos_search_gpu(const double3* y, const int my,
                           const double3* x, const int n,
                           const int dim_x, int * j, my_comp_functor my_comp){

    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if ( i < my) {
      const double3 *ptr = thrust::upper_bound(thrust::seq, x, x+n, y[i], my_comp);
      j[i] = (ptr[0].y - x[i].y - 1);

    }
}

int main(){

  double3 *d_y, *d_x;
  int *d_j;

  cudaMalloc(&d_y, 1024);
  cudaMalloc(&d_x, 1024);
  cudaMalloc(&d_j, 1024);
  struct my_comp_functor my_obj;
  eos_search_gpu<<<1,1>>>(d_y, 0, d_x, 0, 0, d_j, my_obj);
  cudaDeviceSynchronize();
}

(以上代码在CUDA 9.2上编译时没有出现编译错误,但显然不是为了实用而设计的)

最后,我觉得奇怪的是,您正在将一个double数量插入到j[i] (一个整数)中,但它是您的代码。

而且,我可能在函子中错了顺序,所以可能需要将<更改为>

当您调用这个内核时,请注意,我添加了一个参数;您需要在主机代码中实例化一个my_comp_functor对象,然后在适当的位置将它传递给内核。

最后,看起来您正在进行向量化搜索,注意,might有可提供矢量化搜索,可以避免对这个内核的需求。

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/51274740

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档