How can I use a 2D sub-vector in C++ or an OpenCV submatrix with OpenACC?
Stack Overflow user
Asked 2020-07-01 00:51:48

I have the following code:

int main(int argc, char** argv )
{
    std::cout<<"running Lenna..\n";
    cv::Mat mat = imread("lena.bmp", cv::IMREAD_GRAYSCALE );

    //convert to vec
    std::vector<double> BWvec;
    BWvec.assign((double*)mat.data, (double*)mat.data + mat.total());
    std::vector < std::vector<double>> vec2D;
    for (int i = 0; i < mat.rows; i++) {
        auto first = BWvec.begin() + (mat.rows * i);
        auto last = BWvec.begin() + (mat.rows * i) + mat.rows;
        std::vector<double> vec0(first, last);
        vec2D.push_back(vec0);
    }

    //#pragma acc parallel loop
    for (int i = 0; i <= 5; i++) {
        for (int j = 0; j <= 5; j++) {
            mat(cv::Rect(i,j, (4 - 0), (4 - 0)));

            //sub-vector[5:10][25:100]:
            std::vector<std::vector<double>> sub_vector;
            sub_vector.reserve(5);
            for (std::size_t k = 5; k < 10; ++k) {
                sub_vector.emplace_back(vec2D[k+i].begin() + 25, vec2D[k+i].begin() + 100);
            }
        }
    }

    return 0;
}

When I run pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna lenna.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna, it compiles and runs fine. When I uncomment #pragma acc parallel loop, I get these errors:

procedures called in a compute region must have acc routine information
accelerator region ignored, accelerator restriction .. no acc routine information

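The error also appears if I comment out mat(cv::Rect(i,j,(4-0),(4-0))) and keep the part after the sub-vector[5:10][25:100] comment, or if I keep mat(cv::Rect(i,j,(4-0),(4-0))) and comment out the part after the sub-vector[5:10][25:100] comment.

How can I fix this?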

Edit

To make this simpler, here are two separate programs and the errors each one produces:

lenna1.cpp

#include <stdio.h>
#include <cmath>
#include <omp.h>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

//pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna lenna.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna

int main(int argc, char** argv )
{
    std::cout<<"running Lenna..\n";
    cv::Mat mat = imread("lena.bmp", cv::IMREAD_GRAYSCALE );

    #pragma acc parallel loop
    for (int i = 0; i <= 5; i++) {
        for (int j = 0; j <= 5; j++) {
            mat(cv::Rect(i, j, (4 - 0), (4 - 0)));
        }
    }
    return 0;
}

Errors from lenna1.cpp

pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna1 lenna1.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna1
lenna1.cpp:
"lenna1.cpp", line 23: warning: last line of file ends without a newline
  }
   ^

PGCC-S-0155-Procedures called in a compute region must have acc routine information: cv::Mat::Mat(const cv::Mat&, const cv::Rect_<int> &) (lenna1.cpp: 379)
PGCC-S-0155-Accelerator region ignored; see -Minfo messages  (lenna1.cpp: 14)
main:
     14, Accelerator region ignored
         379, Accelerator restriction: call to 'cv::Mat::Mat(const cv::Mat&, const cv::Rect_<int> &)' with no acc routine information
PGCC/x86-64 Linux 19.10-0: compilation completed with severe errors

lenna2.cpp

#include <stdio.h>
#include <cmath>
#include <omp.h>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

//pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna lenna.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna

int main(int argc, char** argv )
{
    std::cout<<"running Lenna..\n";
    cv::Mat mat = imread("lena.bmp", cv::IMREAD_GRAYSCALE );

    //convert to vec
    std::vector<double> BWvec;
    BWvec.assign((double*)mat.data, (double*)mat.data + mat.total());
    std::vector < std::vector<double>> vec2D;
    for (int i = 0; i < mat.rows; i++) {
        auto first = BWvec.begin() + (mat.rows * i);
        auto last = BWvec.begin() + (mat.rows * i) + mat.rows;
        std::vector<double> vec0(first, last);
        vec2D.push_back(vec0);
    }

    #pragma acc parallel loop
    for (int i = 0; i <= 5; i++) {
        for (int j = 0; j <= 5; j++) {
            //sub-vector[5:10][25:100]:
            std::vector<std::vector<double>> sub_vector;
            sub_vector.reserve(5);
            for (std::size_t i = 5; i < 10; ++i) {
                sub_vector.emplace_back(vec2D[i].begin() + 25, vec2D[i].begin() + 100);
            }
        }
    }

    return 0;
}

Errors from lenna2.cpp

pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna2 lenna2.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna2
lenna2.cpp:
"lenna2.cpp", line 40: warning: last line of file ends without a newline
  }
   ^

operator new (unsigned long, void *):
      4, include "opencv.hpp"
          47, include "core.hpp"
               56, include "algorithm"
                    10, include "algorithm"
                         62, include "stl_algo.h"
                              62, include "stl_tempbuf.h"
                                   60, include "stl_construct.h"
                                        59, include "new"
                                            130, Generating implicit acc routine seq
                                                 Generating acc routine seq
                                                 Generating Tesla code
operator delete (void *, void *):
      4, include "opencv.hpp"
          47, include "core.hpp"
               56, include "algorithm"
                    10, include "algorithm"
                         62, include "stl_algo.h"
                              62, include "stl_tempbuf.h"
                                   60, include "stl_construct.h"
                                        59, include "new"
                                            135, Generating implicit acc routine seq
                                                 Generating acc routine seq
                                                 Generating Tesla code
PGCC-S-0155-Procedures called in a compute region must have acc routine information: std::__throw_length_error(const char *) (lenna2.cpp: 69)
PGCC-S-0155-Accelerator region ignored; see -Minfo messages  (lenna2.cpp: 25)
main:
     25, Accelerator region ignored
          69, Accelerator restriction: call to 'std::__throw_length_error(const char *)' with no acc routine information
PGCC/x86-64 Linux 19.10-0: compilation completed with severe errors

1 Answer

Stack Overflow user

Answered 2020-07-01 14:58:35

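To call routines and methods from the device, device versions of those routines must exist. Where the definition of the called routine is visible (for example, with templates), the compiler will try to generate the device routine implicitly. Otherwise, it is the programmer's responsibility to decorate the called routine with the OpenACC "routine" directive.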
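As a minimal sketch of the "routine" directive described above: the names scale_pixel and scale_all are hypothetical and do not come from the question. The helper is marked with acc routine seq so that a device version exists, and the loop passes a raw pointer rather than the std::vector itself. A compiler without OpenACC enabled ignores the pragmas and runs the loop on the host.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper. The directive asks the compiler to also build a
// sequential device version, so it can be called inside a compute region.
#pragma acc routine seq
inline double scale_pixel(double value, double gain) {
    return value * gain;
}

// Multiplies every element by gain, in place.
void scale_all(std::vector<double>& values, double gain) {
    double* data = values.data();  // raw pointer: no std::vector calls in the region
    const int n = static_cast<int>(values.size());
    assert(data != nullptr || n == 0);

    #pragma acc parallel loop copy(data[0:n])
    for (int i = 0; i < n; ++i) {
        data[i] = scale_pixel(data[i], gain);
    }
}
```

Calling scale_all on a vector holding 1, 2, 3 with a gain of 2 leaves 2, 4, 6 in the vector.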

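Since the information you provided is incomplete, it is hard to say exactly how to fix your code. Which routines does the error message say are missing? Can you provide a complete reproducing example?

Edit after the update.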

call to 'cv::Mat::Mat(const cv::Mat&, const cv::Rect_<int> &)' with no acc routine information

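It looks like the constructor for the "Mat" type has no device-callable version. I'm not familiar with how OpenCV is structured, but I assume this constructor is not templated and its definition is not included in the header, so the compiler cannot generate a device version implicitly. You would need to add routine directives to the parts of OpenCV that you want to call from device code or, if CUDA device routines exist, use the OpenACC routine directive with a bind clause to call them.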
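A different workaround, not the one proposed in this answer, is to keep cv::Mat out of the compute region and index the pixel buffer directly. The sketch below is only an illustration: sum_window is a hypothetical function, and it assumes an 8-bit, single-channel, row-major image stored contiguously. In the real program the pointer would presumably come from mat.ptr<uchar>(0) after checking mat.isContinuous(). The window follows the cv::Rect(x, y, width, height) convention, where x is the column and y is the row.

```cpp
#include <cassert>
#include <vector>

// Sums the w-by-h window whose top-left corner is column x, row y.
// Only a raw pointer and integers enter the compute region.
long sum_window(const unsigned char* pixels, int rows, int cols,
                int x, int y, int w, int h) {
    // Host-side bounds check, done before the region starts.
    assert(x >= 0 && y >= 0 && x + w <= cols && y + h <= rows);

    long total = 0;
    const int n = rows * cols;

    #pragma acc parallel loop collapse(2) reduction(+:total) copyin(pixels[0:n])
    for (int r = 0; r < h; ++r) {
        for (int c = 0; c < w; ++c) {
            total += pixels[(y + r) * cols + (x + c)];
        }
    }
    return total;
}
```

Because the window is expressed as index arithmetic, no cv::Mat constructor is called on the device.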

69, Accelerator restriction: call to 'std::__throw_length_error(const char *)' with no acc routine information

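Exception handling is not available in device code, because the exception would need to be caught on the host and there is currently no way to support that.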

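In some cases you can work around this by disabling exceptions with the "--no_exceptions" flag, but in this case OpenCV complains if exceptions are disabled. So it is best to avoid using vectors on the device here.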
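One way to avoid vectors on the device, sketched under assumptions: store the matrix in a flat row-major buffer and copy the sub-block with index arithmetic, so no std::vector constructor (and therefore no throwing code path) appears inside the region. The function copy_block and its parameters are hypothetical. The question's sub-vector[5:10][25:100] would correspond to row0=5, row1=10, col0=25, col1=100.

```cpp
#include <cassert>
#include <vector>

// Copies rows [row0, row1) and columns [col0, col1) of a row-major matrix
// into dst, which must hold (row1 - row0) * (col1 - col0) elements.
void copy_block(const double* src, int rows, int cols,
                int row0, int row1, int col0, int col1, double* dst) {
    // Bounds are validated on the host instead of by throwing on the device.
    assert(0 <= row0 && row0 <= row1 && row1 <= rows);
    assert(0 <= col0 && col0 <= col1 && col1 <= cols);

    const int out_rows = row1 - row0;
    const int out_cols = col1 - col0;
    const int n_src = rows * cols;
    const int n_dst = out_rows * out_cols;

    #pragma acc parallel loop collapse(2) copyin(src[0:n_src]) copyout(dst[0:n_dst])
    for (int r = 0; r < out_rows; ++r) {
        for (int c = 0; c < out_cols; ++c) {
            dst[r * out_cols + c] = src[(row0 + r) * cols + (col0 + c)];
        }
    }
}
```

On the host, the buffers can still be owned by std::vector<double> objects. Only the pointers returned by data() are passed into the function.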

Original content provided by Stack Overflow. Original link: https://stackoverflow.com/questions/62668105
