Q: nvprof - profiling data not recorded

Stack Overflow user

Asked 2019-07-01 10:36:15

Answers: 1, Views: 2.5K, Followers: 0, Votes: 0

I am trying to use the nvprof tool to profile my CUDA program.

Here is my code:

#include <iostream>
#include <math.h>
#include <cuda_profiler_api.h>

// Kernel function to add the elements of two arrays
__global__
void add(int n, float *x, float *y)
{
  int index = blockIdx.x * blockDim.x + threadIdx.x;
  int stride = blockDim.x * gridDim.x;
  for (int i = index; i < n; i += stride)
    y[i] = x[i] + y[i];
}

int main(void)
{
  int N = 1<<10;
  float *x, *y;

  // Allocate Unified Memory – accessible from CPU or GPU
  cudaMallocManaged(&x, N*sizeof(float));
  cudaMallocManaged(&y, N*sizeof(float));

  // initialize x and y arrays on the host
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  // Run kernel on N (1<<10 = 1024) elements on the GPU
  int blockSize = 256;
  int numBlocks = (N + blockSize - 1) / blockSize;
  add<<<numBlocks, blockSize>>>(N, x, y);

  // Wait for GPU to finish before accessing on host
  cudaDeviceSynchronize();

  // Check for errors (all values should be 3.0f)
  float maxError = 0.0f;
  for (int i = 0; i < N; i++)
    maxError = fmax(maxError, fabs(y[i]-3.0f));
  std::cout << "Max error: " << maxError << std::endl;

  // Free memory
  cudaFree(x);
  cudaFree(y);

  cudaProfilerStop();
  cudaDeviceReset();

  return 0;
}
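
As a side check on the launch configuration in the code above, here is a host-only C++ sketch that emulates the grid-stride loop on the CPU. It is not part of the original post: num_blocks and emulate_add are hypothetical helper names, and the nested loops only stand in for the GPU's parallel threads. With N = 1<<10 and a block size of 256 the launch uses 4 blocks, so each of the 1024 threads handles exactly one element.

```cpp
#include <vector>

// Hypothetical helper: the same rounding-up division used in main().
int num_blocks(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}

// Hypothetical CPU emulation of the add kernel: every (block, thread) pair
// runs the same grid-stride loop that the kernel runs on the GPU.
void emulate_add(int n, const std::vector<float>& x, std::vector<float>& y,
                 int grid_dim, int block_dim) {
    for (int block = 0; block < grid_dim; ++block) {
        for (int thread = 0; thread < block_dim; ++thread) {
            int index = block * block_dim + thread;   // global thread index
            int stride = block_dim * grid_dim;        // total number of threads
            for (int i = index; i < n; i += stride)
                y[i] = x[i] + y[i];
        }
    }
}
```

Calling emulate_add(N, x, y, num_blocks(N, 256), 256) on vectors filled with 1.0f and 2.0f should leave every y[i] at 3.0f, which matches the "Max error: 0" the real program prints.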

I compile it with the command nvcc add.cu -o add_cuda.

Then I run it (as root) with nvprof ./add_cuda, with or without --unified-memory-profiling off, and get the following output:

==15318== NVPROF is profiling process 15318, command: ./add_cuda
Max error: 0
==15318== Profiling application: ./add_cuda
==15318== Profiling result:
No kernels were profiled.
No API activities were profiled.
==15318== Warning: Some profiling data are not recorded. Make sure cudaProfilerStop() or cuProfilerStop() is called before application exit to flush profile data.
======== Error: Application received signal 139
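
One possible reading of the last line of this output: if the 139 follows the common shell convention of reporting 128 plus the signal number (an assumption; the nvprof message itself does not say so), it corresponds to signal 11, SIGSEGV on Linux, i.e. a segmentation fault somewhere in the profiled process. The host-only C++ sketch below decodes such a value. signal_from_status and describe_status are hypothetical helpers, not part of nvprof or CUDA.

```cpp
#include <csignal>
#include <string.h>   // strsignal (POSIX, not ISO C++)
#include <string>

// Hypothetical helper: under the assumed convention "status = 128 + signal",
// recover the signal number. Returns -1 if the value does not fit.
// 64 is taken as the highest signal number, which holds on Linux.
int signal_from_status(int status) {
    return (status > 128 && status <= 128 + 64) ? status - 128 : -1;
}

// Human-readable description of the decoded signal.
std::string describe_status(int status) {
    int sig = signal_from_status(status);
    if (sig < 0) return "not a 128+signal status";
    const char* name = strsignal(sig);
    return name ? name : "unknown signal";
}
```

Under that assumption, signal_from_status(139) yields the value of SIGSEGV, which suggests the crash happens inside the profiled run rather than in the program's own arithmetic, since the program still prints "Max error: 0".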

I searched online for solutions ("nvprof not picking up any API calls or kernels" and https://devtalk.nvidia.com/default/topic/1010691/visual-profiler/nvprof-error-code-139-but-memcheck-ok/), but nothing there actually helped.

How can I get nvprof to work?

Thanks!

Fedora 29, 64-bit

nvprof: NVIDIA (R) Cuda command line profiler
Copyright (c) 2012 - 2019 NVIDIA Corporation
Release version 10.1.168 (21)

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Apr_24_19:10:27_PDT_2019
Cuda compilation tools, release 10.1, V10.1.168

nvidia-smi

Mon Jul  1 13:24:54 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:03:00.0  On |                  N/A |
|  0%   37C    P8    20W / 250W |    253MiB / 11175MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1741      G   /usr/libexec/Xorg                            154MiB |
|    0      2161      G   cinnamon                                      96MiB |
+-----------------------------------------------------------------------------+

Tags: cuda, nvcc, nvprof

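1 Answer

Stack Overflow user

Answered 2019-09-26 11:58:37

I got the same result with nvprof version 10.1, but it works with 10.0.

So try downloading CUDA 10.0 and using the nvprof from that version. This works for me even when the program is compiled with nvcc version 10.1.

Votes: 3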

Original content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/56834164
