Q: How do I convert a grid_sample model to a TensorRT model with INT8 quantization?

Stack Overflow user

Asked 2021-09-13 11:52:49

1 answer, 734 views, 0 followers, 0 votes

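I'm trying to convert a PyTorch (1.9) model that uses torch.nn.functional.grid_sample to TensorRT (7) with INT8 quantization through ONNX (opset 11). Opset 11 does not support converting grid_sample to ONNX, so I used ONNX GraphSurgeon together with the external GridSamplePlugin, as proposed here. With it, the conversion to TensorRT succeeds both with and without INT8 quantization. The PyTorch model and the TRT model without INT8 quantization give nearly identical results (MSE on the order of e-10). With INT8 quantization, however, the MSE of the TensorRT model is much higher (185).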

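The grid_sample operator takes two inputs: the input signal and the sampling grid. Both must be of the same type. In GridSamplePlugin, only kFLOAT and kHALF processing is implemented. In my case the X coordinate in the absolute sampling grid (before it is converted to the relative grid that grid_sample requires) varies in the range [-d; W+d], and the Y coordinate in the range [-d; H+d]. The maximum value of W is 640 and of H is 360, and coordinates may take non-integer values within these ranges. For test purposes I created a test model that contains only the grid_sample layer. In this case the TensorRT results with and without INT8 quantization are identical.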
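
To make the concern about coordinate precision concrete, here is a minimal sketch of how coarse a symmetric per-tensor INT8 quantizer would be if it were applied to the sampling grid. It assumes the calibrated dynamic range covers the whole coordinate range; `int8_scale` and `fake_quantize` are hypothetical helpers written for this illustration, not TensorRT or plugin APIs.

```python
def int8_scale(max_abs):
    """Scale of a symmetric INT8 quantizer mapping [-max_abs, max_abs] to [-127, 127]."""
    return max_abs / 127.0


def fake_quantize(value, scale):
    """Quantize to INT8 and dequantize back (round to nearest, clamp to [-127, 127])."""
    q = max(-127, min(127, round(value / scale)))
    return q * scale


WIDTH = 640

# Normalized grid in [-1, 1]: one INT8 step expressed in pixels,
# using the 2 * x / (W - 1) - 1 normalization from the test code.
step_px_normalized = int8_scale(1.0) * (WIDTH - 1) / 2.0

# Absolute pixel coordinates in [0, 640], the example range from the question.
step_px_absolute = int8_scale(float(WIDTH))

# Round-trip one non-integer coordinate through the absolute-coordinate quantizer.
x = 100.25
x_q = fake_quantize(x, step_px_absolute)

print(f"step for normalized grid: {step_px_normalized:.3f} px")
print(f"step for absolute grid:   {step_px_absolute:.3f} px")
print(f"{x} px -> {x_q:.3f} px (error {abs(x - x_q):.3f} px)")
```

Under these assumptions a single INT8 step spans several pixels, so sub-pixel sampling positions cannot survive quantization of the grid itself.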

Here is the code of the test model:

import torch
import numpy as np
import cv2

BATCH_SIZE = 1
WIDTH = 640
HEIGHT = 360

def calculate_grid(B, H, W, dtype, device='cuda'):
    """Build a (B, 2, H, W) grid of absolute pixel coordinates (x in channel 0, y in channel 1)."""
    xx = torch.arange(0, W, device=device).view(1, -1).repeat(H, 1).type(dtype)
    yy = torch.arange(0, H, device=device).view(-1, 1).repeat(1, W).type(dtype)
    # Shear x by a quarter pixel per row so the grid contains non-integer coordinates.
    xx = xx + yy * 0.25
    # repeat() with B == 1 leaves the tensor unchanged, so one code path covers every batch size.
    xx = xx.view(1, 1, H, W).repeat(B, 1, 1, 1)
    yy = yy.view(1, 1, H, W).repeat(B, 1, 1, 1)
    return torch.cat((xx, yy), 1).type(dtype)

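def modify_grid(vgrid, H, W):
    """Normalize absolute coordinates to [-1, 1] and move channels last, as grid_sample expects."""
    # Pixel 0 maps to -1; pixel W - 1 (or H - 1) maps to +1.
    vgrid = torch.cat([
        torch.sub(2.0 * vgrid[:, :1, :, :].clone() / max(W - 1, 1), 1.0),
        torch.sub(2.0 * vgrid[:, 1:2, :, :].clone() / max(H - 1, 1), 1.0),
        vgrid[:, 2:, :, :]], dim=1)
    # (B, 2, H, W) -> (B, H, W, 2)
    vgrid = vgrid.permute(0, 2, 3, 1)
    return vgrid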

class GridSamplingBlock(torch.nn.Module):

    def __init__(self):
        super(GridSamplingBlock, self).__init__()

    def forward(self, input, vgrid):
        output = torch.nn.functional.grid_sample(input, vgrid)
        return output

if __name__ == '__main__':
    model = torch.nn.DataParallel(GridSamplingBlock())
    model.cuda()
    print("Reading inputs")
    img = cv2.imread("result/left_frame_rect_0373.png")
    img = cv2.resize(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), (WIDTH, HEIGHT))
    img_in = torch.from_numpy(img.astype(float)).view(1, 1, HEIGHT, WIDTH).cuda()
    vgrid = calculate_grid(BATCH_SIZE, HEIGHT, WIDTH, img_in.dtype)
    vgrid = modify_grid(vgrid, HEIGHT, WIDTH)
    np.save("result/grid", vgrid.cpu().detach().numpy())
    print("Getting output")
    with torch.no_grad():
        model.module.eval()
        img_out = model.module(img_in, vgrid)
        img = img_out.cpu().detach().numpy().squeeze()
        cv2.imwrite("result/grid_sample_test_output.png", img.astype(np.uint8))

The saved grid was used for both calibration and inference of the TensorRT model.
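
One detail worth checking when comparing PyTorch and plugin outputs is the coordinate convention. The normalization in modify_grid divides by W - 1 and H - 1, which matches grid_sample's align_corners=True mapping, while the call in the test model leaves align_corners at its default (False in PyTorch 1.9). The sketch below reproduces both mappings in plain Python; `normalize` and `unnormalize` are hypothetical helper names, and which convention the plugin follows is not stated in the question.

```python
def normalize(coord_px, size):
    """Map a pixel coordinate to [-1, 1] the same way modify_grid does."""
    return 2.0 * coord_px / max(size - 1, 1) - 1.0


def unnormalize(coord, size, align_corners):
    """Map a normalized coordinate back to a pixel position under either convention."""
    if align_corners:
        # -1 and +1 refer to the centers of the corner pixels.
        return (coord + 1.0) / 2.0 * (size - 1)
    # -1 and +1 refer to the outer edges of the corner pixels.
    return ((coord + 1.0) * size - 1.0) / 2.0


WIDTH = 640
for x in (0.0, 319.5, 639.0):
    g = normalize(x, WIDTH)
    print(
        f"pixel {x:6.1f} -> grid {g:+.3f} -> "
        f"align_corners=True: {unnormalize(g, WIDTH, True):6.1f}, "
        f"align_corners=False: {unnormalize(g, WIDTH, False):6.1f}"
    )
```

With align_corners=False the round trip is off by up to half a pixel at the image borders, so both sides of a comparison should use the same setting.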

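So the questions are:

Is it valid to apply INT8 quantization to functions such as grid_sample, where at least one of the inputs is index-like? Shouldn't such quantization change the result significantly (for example, if INT8 quantization is applied to an input in the range [0, 640])?
How is quantization performed with a custom plugin if only FP32 and FP16 are implemented in the plugin code?
Are the results of the test network identical with and without INT8 quantization because the grid_sample inputs are actually the network inputs there?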

My environment:

TensorRT version: 7
GPU type: NVIDIA GeForce GTX 1050 Ti
NVIDIA driver version: 470.63.01
CUDA version: 10.2.89
cuDNN version: 8.1.1
Operating system and version: Ubuntu 18.04
Python version (if applicable): 3.7
PyTorch version (if applicable): 1.9

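Steps to reproduce:

Run the test code to save the grid and get the Torch results. Use any input image for the test.
Build TensorRT OSS with the custom plugin according to this sample. The latest version of TRT requires some adaptation of GridSamplePlugin, so it is better to use the recommended TensorRT OSS version.
Create the ONNX model according to the code example.
Create the TensorRT engine with or without INT8 quantization and run inference. In my C++ code I use https://github.com/llohse/libnpy to read the grid.npy file.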

pytorch

onnx

quantization

tensorrt

1 Answer

Stack Overflow user

Answered 2022-03-05 16:12:00

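You can split your model into two parts, one before grid_sample and one after it, and quantize each part to INT8 separately. Running grid_sample itself in INT8 will hurt your model's accuracy badly. Note that splitting changes the network structure, so it may also change the graph optimizations that are applied.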
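
As a toy illustration of why the sampler is the part to keep in floating point, the sketch below compares two ways of quantizing a 1D linear-interpolation sampler: quantizing only the sampled data (what the INT8 parts around the sampler would see) versus quantizing the sampling coordinates. Everything here is a simplified stand-in with hypothetical names (`sample_linear`, `fake_quantize`, the sine test signal); it is not TensorRT code and does not reproduce the MSE from the question.

```python
import math

WIDTH = 640


def sample_linear(signal, x):
    """Linearly interpolate `signal` at fractional index x, with zero padding outside."""
    x0 = math.floor(x)
    w1 = x - x0

    def at(i):
        return signal[i] if 0 <= i < len(signal) else 0.0

    return (1.0 - w1) * at(x0) + w1 * at(x0 + 1)


def fake_quantize(value, max_abs):
    """Symmetric INT8 quantize-dequantize for a tensor with dynamic range max_abs."""
    scale = max_abs / 127.0
    return max(-127, min(127, round(value / scale))) * scale


def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Smooth test signal in [0, 255] and non-integer sampling positions.
signal = [127.5 * (1.0 + math.sin(i / 5.0)) for i in range(WIDTH)]
coords = [i + 0.25 for i in range(WIDTH - 1)]

reference = [sample_linear(signal, x) for x in coords]

# Data quantized to INT8, coordinates kept in floating point.
signal_q = [fake_quantize(v, 255.0) for v in signal]
values_quantized = [sample_linear(signal_q, x) for x in coords]

# Data kept in floating point, coordinates quantized to INT8.
coords_quantized = [sample_linear(signal, fake_quantize(x, float(WIDTH))) for x in coords]

mse_values = mse(reference, values_quantized)
mse_coords = mse(reference, coords_quantized)
print(f"MSE with quantized data:        {mse_values:.3f}")
print(f"MSE with quantized coordinates: {mse_coords:.3f}")
```

In this toy setup the error from quantizing coordinates dominates, which is consistent with the suggestion to quantize the two halves of the network and leave the sampling step and its grid in FP32 or FP16.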

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link: https://stackoverflow.com/questions/69162186
