
Routines inside an acc parallel region

Stack Overflow user

Asked on 2020-09-29 12:17:22

1 answer, 443 views, 0 followers, 0 votes

After reading how-can-a-fortran-openacc-routine-call-another-fortran-openacc-routine, I am still confused about the restrictions on OpenACC routine calls.

Here is a modified, deliberately meaningless version of the code from the post linked above:

PROGRAM Test
IMPLICIT NONE

CONTAINS

 SUBROUTINE OuterRoutine( N )
 !$acc routine
   IMPLICIT NONE
   INTEGER :: N
   real :: y
   INTEGER :: i

      DO i = 0, N
         call InnerRoutine( y )
      ENDDO

 END SUBROUTINE OuterRoutine

 subroutine InnerRoutine( y )
 !$acc routine
   IMPLICIT NONE

   real :: y

 END subroutine InnerRoutine

END PROGRAM Test

When I compile it with nvfortran 20.7, I get:

$ nvfortran -acc -Minfo routine.f90
outerroutine:
     14, Generating acc routine seq
         Generating Tesla code
     22, Reference argument passing prevents parallelization: y
innerroutine:
     27, Generating acc routine seq
         Generating Tesla code
nvvmCompileProgram error 9: NVVM_ERROR_COMPILATION.
Error: /tmp/pgaccr22eZDXceweL.gpu (43, 14): parse invalid forward reference to function '_innerroutine_' with wrong type!
ptxas /tmp/pgaccH22eJTMb0hKD.ptx, line 1; fatal   : Missing .version directive at start of file '/tmp/pgaccH22eJTMb0hKD.ptx'
ptxas fatal   : Ptx assembly aborted due to errors
NVFORTRAN-S-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (routine_inline.f90: 1)
  0 inform,   0 warnings,   1 severes, 0 fatal for

What triggers the compilation error? For comparison, here is another piece of code that also calls acc routines:

module data
   integer, parameter :: maxl = 100000
   real, dimension(maxl) :: xstat
   real, dimension(:), allocatable :: yalloc
   !$acc declare create(xstat,yalloc)
   logical :: IsUsed
   !$acc declare create(IsUsed)
 end module
 
 module useit
   use data
 contains
   subroutine compute(n)
      integer :: n
      integer :: i
      !$acc parallel loop present(yalloc,xstat)
      do i = 1, n
         call iprocess(i, yalloc)
      enddo
   end subroutine
   
   subroutine iprocess(i, yalloc)
      !$acc routine seq
      integer :: i
      real,intent(out) :: yalloc(:)
      if(IsUsed) call kernel(i,yalloc)

      contains

      subroutine kernel(i,yalloc)
        !$acc routine seq
        integer, intent(in) :: i
        real,intent(out) :: yalloc(:)
        yalloc(i) = 2*xstat(i)
      end subroutine

   end subroutine 

 end module
 
 program main
 
   use data
   use useit
 
   implicit none
 
   integer :: nSize = 100
   !---------------------------------------------------------------------------
 
   call allocit(nSize)
   call initialize
 
   call compute(nSize)
 
   !$acc update self(yalloc) 
   write(*,*) "yalloc(10)=",yalloc(10) ! expected: 2.0, since yalloc(i) = 2*xstat(i) and xstat = 1.0
 
   call finalize
   
 contains
   subroutine allocit(n)
     integer :: n
     allocate(yalloc(n))
   end subroutine allocit
   
   subroutine initialize
     xstat = 1.0
     yalloc = 1.0
     IsUsed = .true.
     !$acc update device(xstat,yalloc,IsUsed)
   end subroutine initialize
 
   subroutine finalize
 
     deallocate(yalloc)
     
   end subroutine finalize
   
 end program main

This one compiles and runs with OpenACC.

Update: Surprisingly, the first piece of code works when I simply swap the order of the two subroutines:

PROGRAM Test
IMPLICIT NONE

CONTAINS

 subroutine InnerRoutine( y )
 !$acc routine
   IMPLICIT NONE

   real :: y

 END subroutine InnerRoutine

 SUBROUTINE OuterRoutine( N )
 !$acc routine
   IMPLICIT NONE
   INTEGER :: N
   real :: y
   INTEGER :: i

      DO i = 0, N
         call InnerRoutine( y )
      ENDDO

 END SUBROUTINE OuterRoutine

END PROGRAM Test

It really surprises me that this depends on the order of the routines. And why does my second example above work?

openacc

pgi-accelerator

1 Answer

Stack Overflow user

Answered on 2020-09-29 23:33:27

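This is a compiler device code generation bug. When "InnerRoutine" is called from "OuterRoutine", the compiler correctly adds the hidden argument that points to the parent's stack, but the definition of "InnerRoutine" is missing it from its argument list. The error is the resulting mismatch between the callee and the caller.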

I have filed a problem report, TPR #29057. It is not clear whether this is a wider issue or just an artifact of the small test case.
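While the report is open, one thing that might be worth trying is moving both routines out of the main program's CONTAINS section and into a module. Module procedures have no parent stack frame, so no hidden parent argument should be involved. The following is only a sketch under that assumption: the module name routines_mod is made up for illustration, the loop bound and the initialization of y were tidied up, and it has not been checked against nvfortran 20.7.

```fortran
! Hypothetical restructuring of the first example: both device routines
! live in a module instead of being internal procedures of PROGRAM Test.
module routines_mod
   implicit none
contains

   subroutine OuterRoutine(n)
      !$acc routine seq
      integer, intent(in) :: n
      real :: y
      integer :: i

      y = 0.0                    ! give y a defined value before passing it on
      do i = 1, n
         call InnerRoutine(y)
      end do
   end subroutine OuterRoutine

   subroutine InnerRoutine(y)
      !$acc routine seq
      real, intent(inout) :: y   ! body left empty, as in the original test case
   end subroutine InnerRoutine

end module routines_mod

program Test
   use routines_mod
   implicit none

   ! Host-side call, just so the program does something when run.
   call OuterRoutine(3)
   write(*,*) "done"
end program Test
```

It would be compiled the same way as in the question, with nvfortran -acc -Minfo. If the device compile still fails, that would suggest the problem is not limited to internal procedures.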

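Also, be careful when using contained device subroutines. Fortran lets a contained procedure access its parent's local variables by passing in a pointer to the parent's stack. If the parent runs on the host and the child on the device, accessing the parent's variables directly causes a runtime error. For example, if "iprocess" were contained within "compute" and you accessed "i" directly instead of passing it as an argument, you would get an error because the device cannot access the host's stack.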
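To make the safe pattern concrete, here is a minimal sketch in which the device routine sits at module level and receives everything it needs as arguments, so nothing is read from the host parent's stack. The names scale_mod, scale_one and demo are invented for this illustration and do not come from the question.

```fortran
module scale_mod
   implicit none
contains

   ! Host-side driver: launches the device loop.
   subroutine compute(n, x, y)
      integer, intent(in)  :: n
      real,    intent(in)  :: x(n)
      real,    intent(out) :: y(n)
      integer :: i

      !$acc parallel loop copyin(x) copyout(y)
      do i = 1, n
         call scale_one(i, x, y)   ! i, x and y are passed explicitly
      end do
   end subroutine compute

   ! Device routine at module level: it has no host parent frame to reach into.
   subroutine scale_one(i, x, y)
      !$acc routine seq
      integer, intent(in)    :: i
      real,    intent(in)    :: x(:)
      real,    intent(inout) :: y(:)   ! inout, because only one element is written

      y(i) = 2.0 * x(i)
   end subroutine scale_one

end module scale_mod

program demo
   use scale_mod
   implicit none
   integer, parameter :: n = 100
   real :: x(n), y(n)

   x = 1.0
   call compute(n, x, y)

   ! Simple self-check: every element should be twice the input.
   if (any(abs(y - 2.0) > 1.0e-6)) error stop "unexpected result"
   write(*,*) "y(10)=", y(10)
end program demo
```

The risky variant would be to put scale_one inside compute's own CONTAINS section and let it use i through host association. That is legal Fortran on the host, but on the device it is the situation the paragraph above warns about.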

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/64112821
