Q: (SYCL) local_accessor problem (no template named 'local_accessor')

Stack Overflow user

Asked 2021-10-05 18:14:03

Hello, I am new to C++ and SYCL, so please be as specific as possible. Here is the code I am trying to compile:

/*
    Intel oneAPI DPC++
    dpcpp -Qstd=c++17 /EHsc hellocl.cpp -Qtbb opencl.lib -o d.exe

    Microsoft C++ Compiler
    cl /EHsc /std:c++17 hellocl.cpp opencl.lib  /Fe: m.exe

    clang++ -std=c++17 hellocl.cpp -ltbb -lopencl -o c.exe

    g++ -std=c++17 hellocl.cpp -ltbb -lopencl -o c.exe
*/
/*
    1. How to use Random Number Generator
    2. How to use std::vector as 2-dimensional array
    3. How to suppress warning in clang compiler
    4. How to use Tpf_FormatWidth, Tpf_FormatPrecision macros

    dpcpp naive.cpp tbbmalloc.lib -o d.exe
*/

#if defined(__clang__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wpass-failed"
#endif

#define Tpf_FormatWidth 6
#define Tpf_FormatPrecision 4

#include "tpf_linear_algebra.hpp"
#include <cl/sycl.hpp>

namespace chr = tpf::chrono_random;
namespace mtx = tpf::matrix;

tpf::sstream stream;
auto& nl = tpf::nl; // single carriage-return
auto& nL = tpf::nL; // two carriage-returns

auto& endl = tpf::endl; // single carriage-return and flush out to console
auto& endL = tpf::endL; // two carriage-returns and flush out to console

void test_random_number_generator()
{
    using element_t = double;
    using matrix_t = mtx::scalable_fast_matrix_t<element_t>;

    size_t N = 10; // number of rows
    size_t M = N; // number of columns

    matrix_t A{ N, M }; // N x M matrix
    matrix_t B{ N, M };

    // create a random number generator
    // <int> means it generates integers
    // (-10, 10) means from -10 to 10, inclusive
    auto generator = chr::random_generator<int>(-10, 10);

    chr::random_parallel_fill(A.array(), generator);
    chr::random_parallel_fill(B.array(), generator);

    auto C = A * B; // matrix multiplication

    stream << "A = " << nl << A << endl;
    stream << "B = " << nl << B << endl;
    stream << "A x B = " << nl << C << endL;

}

void test_naive_matrix_multiplication()
{
    size_t N = 10;
    size_t M = N;

    using element_t = double;
    using vectrix_t = std::vector<element_t>;

    vectrix_t A(N * M);
    vectrix_t B(N * M);
    vectrix_t C(N * M);
    vectrix_t D(N * M);

    auto generator = chr::random_generator<int>(-10, 10);

    chr::random_parallel_fill(A, generator);
    chr::random_parallel_fill(B, generator);

    auto out_A = mtx::create_formatter(A, N, M);
    auto out_B = mtx::create_formatter(B, N, M);
    auto out_C = mtx::create_formatter(C, N, M);
    auto out_D = mtx::create_formatter(D, N, M);

    stream << "A = " << nl << out_A() << endl;
    stream << "B = " << nl << out_B() << endl;

    auto idx_A = mtx::create_indexer(A, N, M);
    auto idx_B = mtx::create_indexer(B, N, M);
    auto idx_C = mtx::create_indexer(C, N, M);

    for (int i = 0; i < (int)N; ++i)
    {
        for (int j = 0; j < (int)M; ++j)
        {
            for (int k = 0; k < (int)M; ++k)
                idx_C(i, j) += idx_A(i, k) * idx_B(k, j); // matrix multiplication
        }
    }



    stream << "CPU: A x B = " << nl << out_C() << endl;

    
        sycl::queue queue{ sycl::gpu_selector{} };

        sycl::buffer buf_A{ &A[0], sycl::range{N, M} };
        sycl::buffer buf_B{ &B[0], sycl::range{N, M} };
        sycl::buffer buf_D{ &D[0], sycl::range{N, M} };

        queue.submit([&](sycl::handler& cgh)
            {
                auto a = buf_A.get_access<sycl::access::mode::read>(cgh);
                auto b = buf_B.get_access<sycl::access::mode::read>(cgh);
                auto d = buf_D.get_access<sycl::access::mode::read_write>(cgh);

                constexpr int tile_size = 16;
            local_accessor<int> tileA{tile_size, cgh};

            cgh.parallel_for(
nd_range<2>{{N, N}, {1, tile_size}}, [=](nd_item<2> it) {
// Indices in the global index space:
                int m = it.get_global_id()[0];
                int n = it.get_global_id()[1];
// Index in the local index space:
                int i = it.get_local_id()[1];
                size_t sum = 0;
                for (int kk = 0; kk < 496; kk += tile_size) {
// Load the matrix tile from matrix A, and synchronize
// to ensure all work-items have a consistent view
// of the matrix tile in local memory.
                    tileA[i] = a[m][kk + i];
                    it.barrier();
// Perform computation using the local memory tile, and
// matrix B in global memory.
                    for (int k = 0; k < tile_size; k++)
                        sum += tileA[k] * b[kk + k][n];
                
            
// After computation, synchronize again, to ensure all
// reads from the local memory tile are complete.
                it.barrier();
            }
        
// Write the final result to global memory.
            d[m][n] = sum;
        });
        
    });


       

        // when this block goes off,
        // the destructor of buf_D waits until it is released by the queue
        // and copies to its host memory D
    

    stream << "GPU: A x B = " << nl << out_D() << endl;
}

#if defined(__clang__)
#pragma clang diagnostic pop
#endif 

int main()
{
    // test_random_number_generator();

    test_naive_matrix_multiplication();
}

This is simple matrix multiplication code, and I am trying to use an ND-range kernel in this SYCL application. But when I try to compile it, I get errors like these:

hellocl.cpp(128,13): error: no template named 'local_accessor'
            local_accessor<int> tileA{tile_size, cgh};
            ^
hellocl.cpp(131,1): error: use of undeclared identifier 'nd_range'
nd_range<2>{{N, N}, {1, tile_size}}, [=](nd_item<2> it) {
^
2 errors generated.

The ND-range example is from the book "Data Parallel C++", listed at https://www.khronos.org/developers/books/, starting at page 225.

sycl

1 Answer

Stack Overflow user

Answered 2021-10-06 11:11:51

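The classes mentioned in the error log (local_accessor, nd_range, nd_item) are part of SYCL and therefore live in the sycl:: namespace. You need to prefix these types with sycl::, just as you already do for the buffer and the queue. So instead of writing just local_accessor, write sycl::local_accessor.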
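
The lookup rule behind this answer can be tried without a SYCL toolchain. The sketch below is only an illustration: `demo` is a hypothetical namespace, and its `local_accessor` and `nd_range` are minimal stand-ins with the same names, not the real SYCL classes. It shows three common ways to make a namespaced name visible: full qualification, a using-declaration, and a namespace alias.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in namespace. These are NOT the real SYCL classes,
// only minimal types with the same names, used to demonstrate name lookup.
namespace demo {

template <typename T>
class local_accessor {
public:
    explicit local_accessor(std::size_t n) : data_(n) {}
    std::size_t size() const { return data_.size(); }
    T& operator[](std::size_t i) { return data_[i]; }

private:
    std::vector<T> data_;
};

template <int Dims>
struct nd_range {
    static constexpr int dimensions = Dims;
};

} // namespace demo

// Option 1: fully qualify the name at the point of use.
inline std::size_t qualified_use() {
    // local_accessor<int> tile(16);  // would fail: no template named 'local_accessor'
    demo::local_accessor<int> tile(16);
    return tile.size();
}

// Option 2: a using-declaration brings one name into the current scope.
inline int using_declaration() {
    using demo::nd_range;
    return nd_range<2>::dimensions;
}

// Option 3: a namespace alias shortens the prefix without importing names.
namespace d = demo;

inline std::size_t namespace_alias() {
    d::local_accessor<double> tile(4);
    tile[0] = 1.5; // element access works through the alias as well
    return tile.size();
}

// Small self-check that exercises all three options.
inline void run_checks() {
    assert(qualified_use() == 16);
    assert(using_declaration() == 2);
    assert(namespace_alias() == 4);
}
```

Applied to the question's code, option 1 would presumably mean writing sycl::local_accessor<int>, sycl::nd_range<2> and sycl::nd_item<2>. Note that local_accessor is a SYCL 2020 class, so whether it is available depends on the compiler version in use.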


Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/69455391
