Q: How to convert sRGB to NV12 format using NumPy?

Stack Overflow user

Asked on 2019-07-13 20:33:38

1 answer · 5K views · 0 followers · 7 votes

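The NV12 format defines a specific color channel ordering of the YUV color space with 4:2:0 subsampling.

The NV12 format is mostly used in video encoding/decoding pipelines.

The libyuv description of NV12:

NV12 is a biplanar format with a full sized Y plane followed by a single chroma plane with weaved U and V values. NV21 is the same but with weaved V and U values. The 12 in NV12 refers to 12 bits per pixel. NV12 has a half width and half height chroma channel, and therefore is a 420 subsampling.

In the context of NV12, the YUV format is mainly referred to as the YCbCr color space.

NV12 elements are 8 bits per element (uint8 type).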
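The "12 bits per pixel" figure follows directly from the plane sizes. A minimal sketch, assuming even width and height (the helper name `nv12_layout` is made up for illustration and is not part of the original post):

```python
def nv12_layout(width, height):
    """Return plane sizes (in bytes) of an 8-bit NV12 frame with even dimensions."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    y_size = width * height                      # full-resolution Y plane, 1 byte per pixel
    uv_size = (width // 2) * (height // 2) * 2   # half-resolution U and V samples, interleaved
    return {"y_size": y_size, "uv_offset": y_size, "uv_size": uv_size,
            "total": y_size + uv_size}

layout = nv12_layout(1920, 1080)
print(layout["total"])                       # 3110400 bytes
print(layout["total"] * 8 / (1920 * 1080))   # 12.0 bits per pixel
```

The UV plane always takes half as many bytes as the Y plane, so the whole frame is 1.5 bytes per pixel.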

In the context of this post, YUV elements are in the "limited range" standard: the Y range is [16, 235] and the U, V range is [16, 240].
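To see where the limits come from, one can push a few extreme sRGB colors through the BT.709 formulas used in the answer's code. This is only an illustrative check; the function name `rgb_to_yuv_bt709_limited` is an assumption, not something from the original post:

```python
import numpy as np

def rgb_to_yuv_bt709_limited(r, g, b):
    """8-bit sRGB -> limited-range 8-bit YUV (BT.709 coefficients from the answer)."""
    y = 0.1826 * r + 0.6142 * g + 0.0620 * b + 16
    u = -0.1006 * r - 0.3386 * g + 0.4392 * b + 128
    v = 0.4392 * r - 0.3989 * g - 0.0403 * b + 128
    return np.round([y, u, v]).astype(int)

colors = {
    "black": (0, 0, 0),         # lowest Y
    "white": (255, 255, 255),   # highest Y
    "blue": (0, 0, 255),        # highest U
    "yellow": (255, 255, 0),    # lowest U
    "red": (255, 0, 0),         # highest V
    "cyan": (0, 255, 255),      # lowest V
}
for name, rgb in colors.items():
    print(name, rgb_to_yuv_bt709_limited(*rgb))
```

Black and white land on Y = 16 and Y = 235, while the saturated primaries and their complements land on the chroma limits 16 and 240.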

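sRGB (standard Red Green Blue) is the standard color space used by PC systems.

In the context of this post, the sRGB color component range is [0, 255].

The ordering of RGB elements is not relevant to this post (assume 3 color planes).

There are currently at least 2 YCbCr standards that can apply to NV12:

BT.601 - applies to SDTV.
BT.709 - applies to HDTV.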
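The two standards differ only in their luma weights (Kr, Kb). As a sketch of how the coefficient matrices can be derived, the snippet below builds the limited-range matrix from Kr and Kb (0.299 and 0.114 for BT.601, 0.2126 and 0.0722 for BT.709). The function name `limited_range_matrix` is hypothetical; the result can be compared against the four-decimal coefficients in the answer's code:

```python
import numpy as np

def limited_range_matrix(kr, kb):
    """Build the 3x3 matrix and offset for 8-bit RGB -> limited-range 8-bit YCbCr."""
    kg = 1.0 - kr - kb
    y_row = np.array([kr, kg, kb])                                   # Y = Kr*R + Kg*G + Kb*B
    cb_row = 0.5 * (np.array([0.0, 0.0, 1.0]) - y_row) / (1.0 - kb)  # Cb = 0.5*(B - Y)/(1 - Kb)
    cr_row = 0.5 * (np.array([1.0, 0.0, 0.0]) - y_row) / (1.0 - kr)  # Cr = 0.5*(R - Y)/(1 - Kr)
    scale_y = 219.0 / 255.0   # Y spans 235 - 16 = 219 levels
    scale_c = 224.0 / 255.0   # Cb, Cr span 240 - 16 = 224 levels
    matrix = np.vstack((y_row * scale_y, cb_row * scale_c, cr_row * scale_c))
    offset = np.array([16.0, 128.0, 128.0])
    return matrix, offset

bt601, offset = limited_range_matrix(kr=0.299, kb=0.114)
bt709, _ = limited_range_matrix(kr=0.2126, kb=0.0722)
print(np.round(bt601, 4))
print(np.round(bt709, 4))
```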

Example of NV12 element ordering:

YYYYYY

UVUVUV
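Reading the layout in the other direction can make the ordering concrete. The sketch below splits an NV12 buffer back into Y, U and V planes; `split_nv12` is an illustrative name, and the tiny 4x4 frame is synthetic:

```python
import numpy as np

def split_nv12(nv12, rows, cols):
    """Split a flat (or 2-D) NV12 uint8 buffer into Y, U and V planes."""
    nv12 = np.asarray(nv12, dtype=np.uint8).reshape(rows * 3 // 2, cols)
    y = nv12[:rows, :]    # top part: full-resolution Y plane
    uv = nv12[rows:, :]   # bottom part: interleaved U,V,U,V...
    u = uv[:, 0::2]       # even columns hold U
    v = uv[:, 1::2]       # odd columns hold V
    return y, u, v

# Synthetic 4x4 frame: Y = 100 everywhere, U = 10, V = 20.
rows, cols = 4, 4
frame = np.concatenate((np.full(16, 100), np.tile([10, 20], 4))).astype(np.uint8)
y, u, v = split_nv12(frame, rows, cols)
print(y.shape, u.shape, v.shape)   # (4, 4) (2, 2) (2, 2)
```

For NV21 the same slicing would apply with the roles of the even and odd columns swapped.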

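RGB to NV12 conversion can be described by the following stages:

Color space conversion - convert from the sRGB to the YUV color space.
Chroma downsampling - shrink the U and V channels by a factor of 2 in each axis (converting from YUV444 to YUV420).
Chroma element interleaving - arrange the U and V elements as U, V, U, V...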
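The array shapes at each stage can be traced for a 6x6 image. This is only a sketch: the YUV444 planes are random placeholder data standing in for the output of the color space conversion, and the reshape-based 2x2 mean is one possible way to downsample, not necessarily the one used elsewhere in this post:

```python
import numpy as np

rows, cols = 6, 6
rng = np.random.default_rng(0)
y, u, v = rng.uniform(16, 235, size=(3, rows, cols))   # placeholder YUV444 planes

# Chroma downsampling: average each 2x2 block (YUV444 -> YUV420).
u420 = u.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))
v420 = v.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))

# Chroma interleaving: each row becomes U, V, U, V...
uv = np.stack((u420, v420), axis=-1).reshape(rows // 2, cols)

# Y plane on top, UV plane at the bottom.
nv12 = np.vstack((y, uv))
print(y.shape, u420.shape, uv.shape, nv12.shape)   # (6, 6) (3, 3) (3, 6) (9, 6)
```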

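The following figure illustrates the conversion stages for an image size of 6x6 pixels:

[Figure: conversion stages for a 6x6 pixel image]

How can we convert sRGB to NV12 using NumPy?

Note:

The question refers to a Python implementation that demonstrates the conversion process (the post is not aimed at existing functions such as the OpenCV implementation).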

Tags: numpy, image-processing, video-processing, nv12-nv21, python

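1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-07-13 20:33:38

Converting sRGB to NV12 format using NumPy

The purpose of this post is to demonstrate the conversion process.

The Python implementation below uses NumPy and deliberately avoids using OpenCV.

RGB to NV12 conversion stages:

Color space conversion - convert from the sRGB to the YUV color space: use the sRGB to YCbCr conversion formula. Multiply each RGB triple by a 3x3 conversion matrix and add a vector of 3 offsets. The post shows both the BT.709 and BT.601 conversions (the only difference is the coefficient matrix).
Chroma downsampling - shrink the U and V channels by a factor of 2 in each axis (converting from YUV444 to YUV420). The implementation resizes U and V to half size in each axis using bilinear interpolation. Note: bilinear interpolation is not the optimal downsampling method, but it is usually good enough. Instead of using cv2.resize, the code uses the average of every 2x2 pixels (the result is equivalent to bilinear interpolation). Note: the implementation fails if the input resolution is not even in both dimensions.
Chroma element interleaving - arrange the U and V elements as U, V, U, V... Implemented with array indexing operations.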
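The first stage is described as a 3x3 matrix multiplication plus an offset vector. A compact way to write exactly that in NumPy is sketched below; the constant and function names are assumptions for illustration, and the coefficients are the four-decimal values used in the answer's code:

```python
import numpy as np

BT709_MATRIX = np.array([[ 0.1826,  0.6142,  0.0620],
                         [-0.1006, -0.3386,  0.4392],
                         [ 0.4392, -0.3989, -0.0403]])
BT601_MATRIX = np.array([[ 0.2568,  0.5041,  0.0979],
                         [-0.1482, -0.2910,  0.4392],
                         [ 0.4392, -0.3678, -0.0714]])
OFFSET = np.array([16.0, 128.0, 128.0])

def rgb_to_yuv444(rgb, matrix=BT709_MATRIX):
    """rgb: (rows, cols, 3) array in [0, 255]. Returns float limited-range YUV, same shape."""
    # Each pixel's [R, G, B] row vector is multiplied by the transposed matrix,
    # then the offset vector is broadcast over all pixels.
    return rgb.astype(np.float64) @ matrix.T + OFFSET

rgb = np.array([[[255, 255, 255], [0, 0, 0]]], dtype=np.uint8)   # one white and one black pixel
yuv = rgb_to_yuv444(rgb)
print(np.round(yuv))   # white -> Y=235, U=V=128; black -> Y=16, U=V=128
```

Switching standards then amounts to passing `BT601_MATRIX` instead of the default.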

Here is a Python code sample for converting RGB to the NV12 standard:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import subprocess as sp  # The module is used for testing (using FFmpeg as reference).

do_use_bt709 = True  # True for BT.709, False for BT.601

rgb = mpimg.imread('rgb_input.png')[..., :3]*255.0  # Read RGB input image (drop alpha channel if present), set RGB range to [0, 255].
r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]     # Split RGB to R, G and B numpy arrays.
rows, cols = r.shape

# I. Convert RGB to YUV (convert sRGB to YUV444)
#################################################
if do_use_bt709:
    # Convert sRGB to YUV, BT.709 standard
    # Conversion formula used: 8 bit sRGB to "limited range" 8 bit YUV (BT.709).            
    y =  0.1826*r + 0.6142*g + 0.0620*b + 16
    u = -0.1006*r - 0.3386*g + 0.4392*b + 128
    v =  0.4392*r - 0.3989*g - 0.0403*b + 128
else:
    # Convert sRGB to YUV, BT.601 standard.
    # Conversion formula used: 8 bit sRGB to "limited range" 8 bit YUV (BT.601).
    y =  0.2568*r + 0.5041*g + 0.0979*b + 16
    u = -0.1482*r - 0.2910*g + 0.4392*b + 128
    v =  0.4392*r - 0.3678*g - 0.0714*b + 128


# II. U,V Downscaling (convert YUV444 to YUV420)
##################################################
# Shrink U and V channels by a factor of x2 in each axis (use bi-linear interpolation).
#shrunk_u = cv2.resize(u, (cols//2, rows//2), interpolation=cv2.INTER_LINEAR)
#shrunk_v = cv2.resize(v, (cols//2, rows//2), interpolation=cv2.INTER_LINEAR)

# Each element of shrunk_u is the mean of 2x2 elements of u (requires even rows and cols).
# Result is equivalent to resize by a factor of 0.5 with bi-linear interpolation.
shrunk_u = (u[0::2, 0::2] + u[1::2, 0::2] + u[0::2, 1::2] + u[1::2, 1::2]) * 0.25
shrunk_v = (v[0::2, 0::2] + v[1::2, 0::2] + v[0::2, 1::2] + v[1::2, 1::2]) * 0.25


# III. U,V Interleaving
########################
# Size of UV plane is half the number of rows, and same number of columns as Y plane.
uv = np.zeros((rows//2, cols))  # Use // for integer division.

# Interleave shrunk_u and shrunk_v and build UV plane (each row of UV plane is u,v,u,v...)
uv[:, 0::2] = shrunk_u
uv[:, 1::2] = shrunk_v

# Place Y plane at the top, and UV plane at the bottom (number of rows NV12 matrix is rows*1.5)
nv12 = np.vstack((y, uv))

# Round NV12, and cast to uint8.
nv12 = np.round(nv12).astype('uint8')

# Write NV12 array to binary file
nv12.tofile('nv12_output.raw')

# Display NV12 result (display as Grayscale image).
plt.figure()
plt.axis('off')
plt.imshow(nv12, cmap='gray', interpolation='nearest')
plt.show()


# Testing - compare the NV12 result to FFmpeg conversion result:
################################################################################
color_matrix = 'bt709' if do_use_bt709 else 'bt601'

sp.run(['ffmpeg', '-y', '-i', 'rgb_input.png', '-vf', 
        f'scale=flags=fast_bilinear:out_color_matrix={color_matrix}:out_range=tv:dst_format=nv12',
        '-pix_fmt', 'nv12', '-f', 'rawvideo', 'nv12_ffmpeg.raw'])

nv12_ff = np.fromfile('nv12_ffmpeg.raw', np.uint8)
nv12_ff = nv12_ff.reshape(nv12.shape)

abs_diff = np.absolute(nv12.astype(np.int16) - nv12_ff.astype(np.int16)).astype(np.uint8)
max_abs_diff = abs_diff.max()

print(f'max_abs_diff = {max_abs_diff}')

plt.figure()
plt.axis('off')
plt.imshow(abs_diff, cmap='gray', interpolation='nearest')
plt.show()
################################################################################
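The script above fails when the input resolution is odd in either dimension. One possible workaround, sketched here as an assumption rather than as part of the original answer, is to wrap the three stages in a function that first pads odd dimensions by replicating the last row or column. Note that this changes the output size to the next even dimensions, which may or may not be acceptable for a given pipeline; the function name `rgb_to_nv12` is made up for illustration:

```python
import numpy as np

def rgb_to_nv12(rgb, use_bt709=True):
    """rgb: (rows, cols, 3) array in [0, 255] -> NV12 uint8 array of shape (rows*3//2, cols)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Assumption: pad odd dimensions to even ones by repeating the edge row/column.
    rgb = np.pad(rgb, ((0, rgb.shape[0] % 2), (0, rgb.shape[1] % 2), (0, 0)), mode="edge")
    rows, cols = rgb.shape[:2]
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

    if use_bt709:   # same coefficients as the script above
        y = 0.1826*r + 0.6142*g + 0.0620*b + 16
        u = -0.1006*r - 0.3386*g + 0.4392*b + 128
        v = 0.4392*r - 0.3989*g - 0.0403*b + 128
    else:
        y = 0.2568*r + 0.5041*g + 0.0979*b + 16
        u = -0.1482*r - 0.2910*g + 0.4392*b + 128
        v = 0.4392*r - 0.3678*g - 0.0714*b + 128

    def mean_2x2(plane):
        # Mean of every 2x2 block, same result as summing the four strided slices * 0.25.
        return plane.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))

    uv = np.empty((rows // 2, cols))
    uv[:, 0::2] = mean_2x2(u)
    uv[:, 1::2] = mean_2x2(v)
    nv12 = np.vstack((y, uv))
    return np.clip(np.round(nv12), 0, 255).astype(np.uint8)

white = rgb_to_nv12(np.full((5, 7, 3), 255.0))   # odd-sized, all-white test image
print(white.shape)   # (9, 8): padded to 6x8, then 6 Y rows + 3 UV rows
```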

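[Figure: sample RGB input image]

[Figure: NV12 result, displayed as a grayscale image]

Testing:

For testing, we convert the same input image (rgb_input.png) to NV12 format using FFmpeg (a command line tool), and compute the maximum absolute difference between the two conversions.

The test assumes FFmpeg is in the execution path (on Windows, we can place ffmpeg.exe in the same folder as the Python script).

The following shell command converts rgb_input.png to NV12 format using the BT.709 color standard: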

ffmpeg -y -i rgb_input.png -vf "scale=flags=fast_bilinear:out_color_matrix=bt709:out_range=tv:dst_format=nv12" -pix_fmt nv12 -f rawvideo nv12_ffmpeg.raw

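Note:

fast_bilinear interpolation gives the best results for the specific input image - it applies bilinear interpolation when downscaling U and V.

The following Python code compares the NV12 result (nv12_output.raw) with nv12_ffmpeg.raw: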

nv12_ff = np.fromfile('nv12_ffmpeg.raw', np.uint8).reshape(nv12.shape)
abs_diff = np.absolute(nv12.astype(np.int16) - nv12_ff.astype(np.int16)).astype(np.uint8)
print(f'max_abs_diff = {abs_diff.max()}')

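For the specific input image, the maximum difference is 2 or 3 (almost identical).

For other input images, the difference is larger (probably due to wrong FFmpeg arguments).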
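When the difference is larger than expected, it might help to look at each plane separately: a mismatch confined to the Y plane would point at the color matrix or range settings, while a mismatch confined to U and V would point at the chroma downsampling. The helper below is a hypothetical sketch (not from the original answer), demonstrated on synthetic arrays instead of real FFmpeg output:

```python
import numpy as np

def nv12_plane_diffs(a, b, rows):
    """Max absolute difference per plane between two NV12 arrays of shape (rows*3//2, cols)."""
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    return {"y": int(diff[:rows].max()),          # Y plane: first `rows` rows
            "u": int(diff[rows:, 0::2].max()),    # U samples: even columns of the UV plane
            "v": int(diff[rows:, 1::2].max())}    # V samples: odd columns of the UV plane

# Synthetic example: perturb one Y sample and one V sample.
rows, cols = 4, 4
ref = np.full((rows * 3 // 2, cols), 128, dtype=np.uint8)
test = ref.copy()
test[0, 0] += 3      # Y plane
test[rows, 1] -= 2   # first V sample
print(nv12_plane_diffs(test, ref, rows))   # {'y': 3, 'u': 0, 'v': 2}
```

With the arrays from the comparison code above, the call would be `nv12_plane_diffs(nv12, nv12_ff, rows)`.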

6 votes

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/57022633
