HK1251970B

HK1251970B - Decoder and method for generating high dynamic range images, and storage medium

Info

Publication number: HK1251970B
Application number: HK18111288.8A
Authority: HK
Inventors: 曲晟; 尹鹏; 叶琰; 贺玉文; 沃尔特．吉什; 苏冠铭; 袁玉斐; 萨米尔．胡利亚尔卡尔
Original assignee: 杜比实验室特许公司
Priority date: 2012-01-03
Filing date: 2018-09-04
Publication date: 2020-12-11

Description

Decoder, method and storage medium for generating high dynamic range images

This application is a divisional of invention patent application No. 201280066010.4, filed on December 18, 2012 and entitled "Specifying Visual Dynamic Range Coding Operations and Parameters".

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 61/582,614, filed on January 3, 2012, the disclosure of which is hereby incorporated by reference in its entirety.

Technical Field

The present application relates generally to video coding systems, and more particularly to systems for encoding, decoding, and representing visual dynamic range images.

Background Art

Display technologies have been developed that support transmitting and rendering video content based on specific video formats. For example, MPEG video encoders and decoders can support video content encoded in an MPEG video format. Other video encoders and decoders can support video content encoded in different video formats.

Consumer devices such as handheld devices are typically installed or configured with a limited set of video coding systems, each of which may support a particular video format among a limited set of video formats. Consequently, if a piece of video content is not encoded and delivered in an expected video format, the device may be unable to find a suitable video decoder to decode the video content and help render it. Even if the video content is rendered, the rendered content may reflect an incorrect interpretation or representation of the received video content and exhibit visible artifacts in color and luminance values. WO2008/077272A1, WO2008/083521A1, WO2009/051692A2, and WO2010/105036A1 describe coding of video data in a scalable or layered manner.

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any approach described in this section qualifies as prior art merely by virtue of its inclusion in this section. Similarly, unless otherwise indicated, it should not be assumed on the basis of this section that issues identified with respect to one or more approaches have been recognized in any prior art.

Summary of the Invention

According to one embodiment, a video encoding method is provided, comprising: receiving an input visual dynamic range image and an input base layer image associated with the input visual dynamic range image; generating, for encoding-related operations, a coding syntax comprising a plurality of syntax elements at least at a frame level; converting the input base layer image and the input visual dynamic range image into base layer data and enhancement layer data according to the coding syntax; wherein the enhancement layer data comprises residual values between the input visual dynamic range image and a predicted visual dynamic range image generated based on the base layer data; wherein the plurality of syntax elements specify one or more of the following operations: a chroma resampling operation, an inverse mapping operation, a prediction operation based on non-overlapping regions, a prediction operation based on overlapping regions, residual nonlinear quantization and dequantization operations, a residual chroma resampling operation, a residual spatial upsampling operation, and data processing operations including interpolation, these operations being performed to generate the base layer data and the enhancement layer data; encoding the syntax elements of the coding syntax into reference processing unit data so that a corresponding decoder can use the same coding syntax for decoding-related operations; wherein the reference processing unit data comprises a current reference processing unit data unit; wherein the current reference processing unit data unit comprises a reference processing unit data header and a reference processing unit data payload; indicating, in the current reference processing unit data unit, at least one syntax element of the plurality of syntax elements in the coding syntax as being predictable from one or more previous reference processing unit data units of a previous input visual dynamic range image and of a previous input base layer image associated with the previous input visual dynamic range image; and outputting the base layer data, the enhancement layer data, and the reference processing unit data in a base layer signal, an enhancement layer signal, and a reference processing unit signal; wherein the current reference processing unit data unit comprises a flag indicating whether frame-level syntax elements of previously sent reference processing unit data should be reused; wherein the reference processing unit data header comprises a field indicating one or more visual dynamic range versions to which the reference processing unit data relates; and wherein the coding syntax included in the reference processing unit data enables the input visual dynamic range image to be reconstructed based on the base layer data and the enhancement layer data.

According to another embodiment, a video decoding method is provided, comprising: receiving base layer data, enhancement layer data, and reference processing unit data in a base layer signal, an enhancement layer signal, and a reference processing unit signal, the base layer data, the enhancement layer data, and the reference processing unit data being associated with a common visual dynamic range source image; wherein the enhancement layer data comprises residual values between the visual dynamic range source image and a predicted visual dynamic range image generated based on the base layer data; decoding the reference processing unit data into a coding syntax comprising a plurality of syntax elements at least at a frame level; wherein the plurality of syntax elements specify one or more of the following operations: a chroma resampling operation, an inverse mapping operation, a prediction operation based on non-overlapping regions, a prediction operation based on overlapping regions, residual nonlinear quantization and dequantization operations, a residual chroma resampling operation, a residual spatial upsampling operation, and data processing operations including interpolation, these operations being performed to generate the base layer data and the enhancement layer data; converting the base layer data and the enhancement layer data into a reconstructed visual dynamic range image according to the coding syntax; wherein the reference processing unit data comprises a current reference processing unit data unit; wherein the current reference processing unit data unit comprises a reference processing unit data header and a reference processing unit data payload; wherein the current reference processing unit data unit comprises a flag indicating whether frame-level syntax elements of previously sent reference processing unit data should be reused; wherein the reference processing unit data header comprises a field indicating one or more visual dynamic range versions to which the reference processing unit data relates; wherein the coding syntax included in the reference processing unit data enables the reconstructed visual dynamic range image to be determined based on the base layer data and the enhancement layer data; determining, according to the current reference processing unit data unit, at least one syntax element of the plurality of syntax elements of the coding syntax as being predictable from one or more previous reference processing unit data units relating to a previously reconstructed visual dynamic range image; and outputting the reconstructed visual dynamic range image.

Embodiments of the present invention also include an encoder that performs the above encoding method.

Embodiments of the present invention also include a decoder that performs the above decoding method.

Embodiments of the present invention also include a system that performs the above methods.

According to yet another embodiment, a video encoding system comprises an encoder configured to: receive an input visual dynamic range image and an input base layer image associated with the input visual dynamic range image; generate, for encoding-related operations, a coding syntax comprising a plurality of syntax elements at least at a frame level; convert the input visual dynamic range image and the input base layer image into base layer data and enhancement layer data according to the coding syntax; wherein the enhancement layer data comprises residual values between the input visual dynamic range image and a predicted visual dynamic range image generated based on the base layer data; wherein the plurality of syntax elements specify one or more of the following operations: a chroma resampling operation, an inverse mapping operation, a prediction operation based on non-overlapping regions, a prediction operation based on overlapping regions, residual nonlinear quantization and dequantization operations, a residual chroma resampling operation, a residual spatial upsampling operation, and data processing operations including interpolation, these operations being performed to generate the base layer data and the enhancement layer data; encode the coding syntax into reference processing unit data so that a corresponding decoder can use the same coding syntax for decoding-related operations; wherein the reference processing unit data comprises a current reference processing unit data unit; wherein the current reference processing unit data unit comprises a reference processing unit data header and a reference processing unit data payload; wherein the current reference processing unit data unit comprises a flag indicating whether frame-level syntax elements of previously sent reference processing unit data should be reused; wherein the reference processing unit data header comprises a field indicating one or more visual dynamic range versions to which the reference processing unit data relates; wherein the coding syntax included in the reference processing unit data enables the input visual dynamic range image to be reconstructed based on the base layer data and the enhancement layer data; indicate, in the current reference processing unit data unit, at least one syntax element of the plurality of syntax elements in the coding syntax as being predictable from one or more previous reference processing unit data units of a previous input visual dynamic range image and of a previous input base layer image associated with the previous input visual dynamic range image; and output the base layer data, the enhancement layer data, and the reference processing unit data in a base layer signal, an enhancement layer signal, and a reference processing unit signal.

According to still another embodiment, a video decoding system comprises a decoder configured to: receive base layer data, enhancement layer data, and reference processing unit data in a base layer signal, an enhancement layer signal, and a reference processing unit signal, the base layer data, the enhancement layer data, and the reference processing unit data being associated with a common visual dynamic range source image; wherein the enhancement layer data comprises residual values between the visual dynamic range source image and a predicted visual dynamic range image generated based on the base layer data; decode the reference processing unit data into a coding syntax comprising a plurality of syntax elements at least at a frame level; wherein the plurality of syntax elements specify one or more of the following operations: a chroma resampling operation, an inverse mapping operation, a prediction operation based on non-overlapping regions, a prediction operation based on overlapping regions, residual nonlinear quantization and dequantization operations, a residual chroma resampling operation, a residual spatial upsampling operation, and data processing operations including interpolation, these operations being performed to generate the base layer data and the enhancement layer data; convert the base layer data and the enhancement layer data into a reconstructed visual dynamic range image according to the coding syntax; wherein the reference processing unit data comprises a current reference processing unit data unit; wherein the current reference processing unit data unit comprises a reference processing unit data header and a reference processing unit data payload; wherein the current reference processing unit data unit comprises a flag indicating whether frame-level syntax elements of previously sent reference processing unit data should be reused; wherein the reference processing unit data header comprises a field indicating one or more visual dynamic range versions to which the reference processing unit data relates; wherein the coding syntax included in the reference processing unit data enables the reconstructed visual dynamic range image to be determined based on the base layer data and the enhancement layer data; determine, according to the current reference processing unit data unit, at least one syntax element of the plurality of syntax elements of the coding syntax as being predictable from one or more previous reference processing unit data units relating to a previously reconstructed visual dynamic range image; and output the reconstructed visual dynamic range image.

According to one embodiment, a decoder for generating a high dynamic range image is provided, wherein the decoder comprises a non-transitory memory and one or more processors, and wherein generating an output image using the decoder comprises: receiving reference processing unit (RPU) data and storing at least a portion of the RPU data in the non-transitory memory; extracting at least an RPU data header and an RPU data payload from the RPU data; receiving a base layer image; receiving an enhancement layer image; determining whether the decoder may use syntax elements from a previous RPU data payload; upon determining that the decoder may not use syntax elements from a previous RPU data payload, parsing the RPU data payload to extract syntax elements; upon determining that the decoder may use syntax elements from a previous RPU data payload, decoding a previous RPU Id value to predict the syntax elements, wherein the syntax elements comprise a current RPU Id value, a mapping color space indicator, a chroma resampling filter indicator, pivot mapping parameters, and partition parameters; and combining the base layer image and the enhancement layer image based on the syntax elements to generate the output image. Parsing the RPU data payload comprises: decoding the current RPU Id value; decoding the mapping color space indicator; decoding the chroma resampling filter indicator; decoding the pivot mapping parameters; and decoding the partition parameters.

According to one embodiment, a method for generating a high dynamic range image using a decoder is provided, wherein generating an output image using the decoder comprises: receiving reference processing unit (RPU) data and storing at least a portion of the RPU data in a non-transitory memory; extracting at least an RPU data header and an RPU data payload from the RPU data; receiving a base layer image; receiving an enhancement layer image; determining whether the decoder may use syntax elements from a previous RPU data payload; upon determining that the decoder may not use syntax elements from a previous RPU data payload, parsing the RPU data payload to extract syntax elements; upon determining that the decoder may use syntax elements from a previous RPU data payload, decoding a previous RPU Id value to predict the syntax elements, wherein the syntax elements comprise a current RPU Id value, a mapping color space indicator, a chroma resampling filter indicator, pivot mapping parameters, and partition parameters; and combining the base layer image and the enhancement layer image based on the syntax elements to generate the output image. Parsing the RPU data payload comprises: decoding the current RPU Id value; decoding the mapping color space indicator; decoding the chroma resampling filter indicator; decoding the pivot mapping parameters; and decoding the partition parameters.

According to one embodiment, a non-transitory computer-readable storage medium storing computer-executable instructions is provided, the computer-executable instructions being for performing, using one or more processors, the method according to the above embodiment.
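By way of illustration only, the decoder-side choice described in the embodiments above (parse the current RPU data payload, or predict the syntax elements from a previously received RPU identified by its RPU Id) might be sketched as follows. Every name here (`SyntaxElements`, `RpuDataUnit`, `use_prev_rpu`, `prev_rpu_id`, `RpuSyntaxCache`) is hypothetical and does not come from the specification. The payload is assumed to be already split into named fields, and "prediction" is simplified to reusing the stored elements unchanged.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SyntaxElements:
    """The five syntax elements named in the embodiment (field names are illustrative)."""
    rpu_id: int
    mapping_color_space: int
    chroma_resampling_filter: int
    pivot_mapping_params: tuple
    partition_params: tuple


@dataclass
class RpuDataUnit:
    """Hypothetical, already-demultiplexed view of one RPU data unit."""
    use_prev_rpu: bool                 # assumed reuse flag
    prev_rpu_id: Optional[int] = None  # consulted only when use_prev_rpu is True
    payload: Optional[dict] = None     # consulted only when use_prev_rpu is False


class RpuSyntaxCache:
    """Keeps previously parsed syntax elements, indexed by RPU Id."""

    def __init__(self) -> None:
        self._by_id: Dict[int, SyntaxElements] = {}

    def resolve(self, unit: RpuDataUnit) -> SyntaxElements:
        if unit.use_prev_rpu:
            # Assumption: prediction means reusing the stored elements verbatim.
            if unit.prev_rpu_id not in self._by_id:
                raise KeyError(f"unknown previous RPU Id: {unit.prev_rpu_id}")
            return self._by_id[unit.prev_rpu_id]
        if unit.payload is None:
            raise ValueError("payload is required when no previous RPU is reused")
        p = unit.payload
        elements = SyntaxElements(
            rpu_id=p["rpu_id"],
            mapping_color_space=p["mapping_color_space"],
            chroma_resampling_filter=p["chroma_resampling_filter"],
            pivot_mapping_params=tuple(p["pivot_mapping_params"]),
            partition_params=tuple(p["partition_params"]),
        )
        self._by_id[elements.rpu_id] = elements  # remember for later reuse
        return elements
```

In this sketch a later data unit that sets `use_prev_rpu` and names an earlier `prev_rpu_id` receives exactly the elements parsed for that earlier unit; a real decoder could instead refine the predicted values.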

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to like elements, and in which:

FIGS. 1 and 2 illustrate a VDR encoder that generates reference processing unit (RPU) data based on a coding syntax in accordance with one or more visual dynamic range (VDR) specifications, in example embodiments;

FIG. 3 illustrates a NAL data unit comprising a NAL header and a raw byte sequence payload, in an example embodiment;

FIG. 4 illustrates the layout of an RPU data header, and FIG. 5 illustrates RPU data header parsing, in an example embodiment;

FIG. 6 illustrates the layout of an RPU data payload, and FIGS. 7 to 9 illustrate RPU data payload decoding, in an example embodiment;

FIG. 10 illustrates a VDR decoder that parses a coding syntax from RPU data, in an example embodiment;

FIGS. 11A and 11B illustrate example process flows according to example embodiments of the present invention; and

FIG. 12 illustrates an example hardware platform on which a computer or computing device as described herein may be implemented, according to an embodiment of the present invention.

DETAILED DESCRIPTION

Example embodiments, which relate to encoding, decoding, and representing visual dynamic range images using a layered VDR codec, are described herein. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating the present invention.

Example embodiments are described herein according to the following outline:

1. General Overview

2. VDR Encoder

3. RPU Data Unit

4. RPU Data Decoding - Sequence Level and/or Frame Level

5. RPU Data Decoding - Partition Level

6. RPU Data Decoding - Chroma Mapping

7. Additional Examples of RPU Data Decoding

8. Example Process Flows

9. Implementation Mechanisms - Hardware Overview

10. Equivalents, Extensions, Alternatives and Miscellaneous

1. General Overview

This overview presents a basic description of some aspects of example embodiments of the present invention. It should be noted that this overview is not an extensive or exhaustive summary of aspects of the example embodiments. Moreover, it should be noted that this overview is not intended to be understood as identifying any particularly significant aspects or elements of the example embodiments, nor as delineating any scope of the example embodiments in particular or of the invention in general. This overview merely presents some concepts that relate to the example embodiments in a condensed and simplified form, and should be understood as merely a conceptual prelude to the more detailed description of example embodiments that follows.

The techniques described herein support transmitting and signaling reference processing data, generated by different video coding systems, to downstream devices in a common carrier. As used herein, the term "common carrier" may refer to a common reference processing unit (RPU) data format configured to carry reference processing data generated under any one of a plurality of visual dynamic range (VDR) specifications. As used herein, the reference processing data carried in the common carrier provides a coding syntax comprising a plurality of syntax elements, where the syntax elements specify operations and parameters that are used, or are to be used, in encoding and decoding the video data associated with the reference processing data. The syntax elements described herein can describe operations that are the same for some or all of the different VDR specifications, or operations that are specific to one or more, but not all, of the different VDR specifications. A common reference processing data decoding/parsing process may be implemented, for example, in a downstream (e.g., consumer) device to decode the coding syntax or the syntax elements therein, regardless of which VDR specification the coding syntax relates to. A downstream device therefore does not need to implement a separate and different reference processing data decoding/parsing process for each existing or new VDR specification that the device is, or will be, configured to support. Vendors of downstream devices may only need to focus on supporting the algorithms or operations used to encode or decode the media samples specified under existing or new VDR specifications. Because the techniques described herein use a common carrier to transmit or signal the reference processing data associated with media samples generated under different VDR specifications, the same reference processing data decoding/parsing process can be reused, or substantially reused, for all VDR specifications, including VDR specifications that have yet to be developed. As used herein, the term "media samples" refers to data that, when combined with the reference processing data, forms the VDR data described herein.
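The separation described above, with one shared decoding/parsing process and specification-specific operations behind it, can be illustrated with a small sketch. The byte layout used here (byte 0 as a VDR version, byte 1 as a payload length) is invented purely for illustration and is not the RPU data format; the function and registry names are likewise hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry: VDR version number -> specification-specific handler.
_HANDLERS: Dict[int, Callable[[bytes], str]] = {}


def register_vdr_version(version: int):
    """Decorator that registers the handler for one (hypothetical) VDR version."""
    def decorator(handler: Callable[[bytes], str]) -> Callable[[bytes], str]:
        _HANDLERS[version] = handler
        return handler
    return decorator


def parse_common_carrier(raw: bytes) -> dict:
    """Parse a toy carrier: byte 0 = VDR version, byte 1 = payload length.

    This layout is an assumption made for the example, not the real format.
    """
    if len(raw) < 2:
        raise ValueError("carrier too short")
    version, length = raw[0], raw[1]
    payload = raw[2:2 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return {"vdr_version": version, "payload": payload}


def process_rpu(raw: bytes) -> str:
    """Run the shared parser, then hand off to the version-specific handler."""
    unit = parse_common_carrier(raw)
    handler = _HANDLERS.get(unit["vdr_version"])
    if handler is None:
        raise LookupError(f"no handler for VDR version {unit['vdr_version']}")
    return handler(unit["payload"])


@register_vdr_version(1)
def _handle_v1(payload: bytes) -> str:
    return f"v1 operations on {len(payload)} payload bytes"


@register_vdr_version(2)
def _handle_v2(payload: bytes) -> str:
    return f"v2 operations on {len(payload)} payload bytes"
```

Supporting a further specification in this sketch means registering one more handler; `parse_common_carrier` itself is left untouched, which mirrors the reuse argument made in the text.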

Multiple layers (or bitstreams) may be used to deliver VDR data (media samples and reference processing unit data) from an upstream device, such as a VDR encoder, to downstream devices. The VDR data carried in the multiple layers may be used to support a wide range of display technologies, which may include, but are not limited to, any of backward-compatible display technologies and new high dynamic range (HDR) display technologies. As used herein, the term "VDR" or "visual dynamic range" may refer to a dynamic range wider than a standard dynamic range, and may include, but is not limited to, a wide dynamic range up to the instantaneously perceivable dynamic range and color gamut that human vision can perceive at an instant. As used herein, the term "multi-layer" or "multiple layers" may refer to two or more bitstreams, including a base layer (BL), a reference processing unit (RPU) layer, and an enhancement layer (EL), that carry multiple video or image signals having one or more logical dependencies on one another.

The base layer may carry BL data (base layer media samples) that an upstream device obtains from an SDR signal or maps from input VDR data in a VDR signal. One or more enhancement layers (EL) may carry EL data (enhancement layer media samples) that the upstream device obtains at least in part from the VDR signal. In some embodiments, to exploit the statistical redundancy between the BL data and the EL data, both of which relate to the same input VDR data, the EL data may be reduced in redundancy to comprise residual values, that is, differences between the input VDR video data and prediction values based on the BL video data. In some embodiments, the VDR encoder may be configured to apply an accurate prediction algorithm such that the residual values are reduced to zero; the EL data may then be used to hold a reduced set of inter-layer reference pictures associated with the accurate prediction algorithm, rather than unhelpful zero residual values. The inter-layer reference pictures of such a residual-free video coding system may be generated not for a single VDR image but for a group of related input VDR images.
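As a minimal numeric sketch of the residual idea described above, assume an 8-bit base layer, a 12-bit VDR signal, and a simple linear inverse mapping as the predictor. The gain and offset values and the function names are illustrative assumptions, not the prediction method of the specification.

```python
from typing import List


def predict_vdr(bl_pixels: List[int], gain: int = 16, offset: int = 0) -> List[int]:
    """Predict VDR samples from BL samples with an assumed linear inverse mapping."""
    return [gain * p + offset for p in bl_pixels]


def encode_residual(vdr_pixels: List[int], bl_pixels: List[int]) -> List[int]:
    """EL residual = input VDR sample minus the sample predicted from the BL."""
    if len(vdr_pixels) != len(bl_pixels):
        raise ValueError("VDR and BL images must have the same number of samples")
    predicted = predict_vdr(bl_pixels)
    return [v - p for v, p in zip(vdr_pixels, predicted)]


def reconstruct_vdr(bl_pixels: List[int], residual: List[int]) -> List[int]:
    """Decoder side: rebuild the VDR samples from the BL prediction plus the residual."""
    if len(bl_pixels) != len(residual):
        raise ValueError("BL image and residual must have the same number of samples")
    predicted = predict_vdr(bl_pixels)
    return [p + r for p, r in zip(predicted, residual)]


vdr = [100, 2050, 4095]   # 12-bit input VDR samples
bl = [6, 128, 255]        # 8-bit base layer samples
residual = encode_residual(vdr, bl)
print(residual)                              # small values compared with the VDR samples
print(reconstruct_vdr(bl, residual) == vdr)  # reconstruction is lossless in this toy case
```

The better the predictor matches the input, the smaller the residual becomes; with a perfect predictor it is all zeros, which is the residual-free case mentioned in the text.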

在一些实施方式中，RPU层可以携带由上游设备生成的参考处理数据(或者表示为RPU数据)。上游设备如VDR编码器可以使用RPU数据将编码语法用信号通知给下游设备如VDR解码器。编码语法使得VDR解码器能够基于BL层和EL层中的BL数据和EL数据来重构VDR图像。In some embodiments, the RPU layer may carry reference processing data (or denoted as RPU data) generated by an upstream device. An upstream device, such as a VDR encoder, may use the RPU data to signal a coding syntax to a downstream device, such as a VDR decoder. The coding syntax enables the VDR decoder to reconstruct a VDR image based on the BL data and EL data in the BL layer and the EL layer.

RPU数据中携带的语法元素的示例可以包括但不限于以下项中的任意项：层间预测系数、残差非线性去量化参数、色度重采样滤波器系数、颜色空间变换指示符或其他VDR语法元素(例如，由VDR编码器执行来生成BL数据和EL数据以及由VDR解码器执行来对BL数据和EL数据进行解码的函数和/或操作的标志和描述符)。RPU数据中的语法元素可以被分类为序列级、帧级、分区级或函数/操作级中之一。Examples of syntax elements carried in RPU data may include, but are not limited to, any of the following: inter-layer prediction coefficients, residual nonlinear dequantization parameters, chroma resampling filter coefficients, color space transform indicators, or other VDR syntax elements (e.g., flags and descriptors of functions and/or operations performed by a VDR encoder to generate BL data and EL data and performed by a VDR decoder to decode BL data and EL data). Syntax elements in RPU data may be categorized as one of a sequence level, a frame level, a partition level, or a function/operation level.

输入VDR媒体内容(例如，HDR影片)可以被细分成序列(例如，对应于场景或场景的一部分等)、帧(或图像或图片)或分区(或图像的部分)。序列级、帧级或分区级处的语法元素可以明确地被编码在当前RPU数据中，或从相应的序列级、帧级或分区级处的先前发送的RPU数据预测。另外地、可选地或可替代地，语法元素可以在序列级、帧级或分区级中多于一级处出现。Input VDR media content (e.g., HDR movie) can be subdivided into sequences (e.g., corresponding to a scene or a portion of a scene, etc.), frames (or images or pictures), or partitions (or portions of an image). Syntax elements at the sequence level, frame level, or partition level can be explicitly encoded in the current RPU data or predicted from previously sent RPU data at the corresponding sequence level, frame level, or partition level. Additionally, optionally, or alternatively, syntax elements can appear at more than one level of the sequence level, frame level, or partition level.
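
The choice between explicitly coded and predicted syntax elements can be sketched as follows. This is a minimal illustration under stated assumptions, not the bitstream format of any VDR specification: the function name `resolve_rpu_level`, the element names, and the convention that `None` marks "predict from previously sent RPU data" are all hypothetical.

```python
from typing import Dict, Optional


def resolve_rpu_level(current: Dict[str, Optional[float]],
                      previous: Dict[str, float]) -> Dict[str, float]:
    """Resolve the syntax elements of one level (sequence, frame, or partition).

    Hypothetical convention: a value of None in `current` means the element
    is not explicitly coded and is taken from the previously sent RPU data
    at the same level.
    """
    resolved = dict(previous)          # start from previously sent values
    for name, value in current.items():
        if value is not None:
            resolved[name] = value     # explicitly coded in the current RPU data
        elif name not in previous:
            raise KeyError(f"no previous value to predict {name!r} from")
    return resolved


# Hypothetical frame-level elements: one is reused, one is re-sent.
previous_frame = {"pred_coef_0": 0.1, "pred_coef_1": 0.9}
current_frame = {"pred_coef_0": None, "pred_coef_1": 0.8}
resolved_frame = resolve_rpu_level(current_frame, previous_frame)
```

Only the elements that change need to be sent again, which is where the redundancy between current and previously sent RPU data can be exploited.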

可以通过不同的VDR编码系统实现本文所描述的包括分层编解码器结构(BL层、EL层和RPU层)的技术。例如，实现这些技术中的一些技术的第一VDR编码系统可以是基于残差的分层编解码器，其中基本层和增强层两者使用色度格式(例如，4:2:0)和低位深度(例如，8位)；实现这些技术中的一些技术的第二VDR编码系统可以是基于信号的分层编解码器，其中增强层使用比基本层(4:2:0 8位)的色度格式(例如，4:2:0)和位深度(例如，8位)更宽的色度格式(例如，4:4:4)和较高的位深度(例如，12位或更多)。具体地，本文所描述的RPU数据解码技术可以在最初支持一组一个或更多个不同VDR规范的VDR编码系统中实现，并且可以当随后支持另外的VDR规范时在VDR编码系统中很少变化或没有变化情况下被重复使用。The techniques described herein, including a layered codec structure (BL layer, EL layer, and RPU layer), can be implemented by different VDR encoding systems. For example, a first VDR encoding system implementing some of these techniques can be a residual-based layered codec in which both the base layer and the enhancement layer use a chroma format (e.g., 4:2:0) and a low bit depth (e.g., 8 bits); a second VDR encoding system implementing some of these techniques can be a signal-based layered codec in which the enhancement layer uses a wider chroma format (e.g., 4:4:4) and a higher bit depth (e.g., 12 bits or more) than the chroma format (e.g., 4:2:0) and bit depth (e.g., 8 bits) of the base layer (4:2:0 8 bits). Specifically, the RPU data decoding techniques described herein can be implemented in a VDR encoding system that initially supports a set of one or more different VDR specifications and can be reused with little or no changes in the VDR encoding system when additional VDR specifications are subsequently supported.

RPU层编码的比特流或其中的RPU数据可以与其他层中的编码的比特流，或其中的BL数据和EL数据同步。例如，可以按照显示顺序(例如，如H.264中指定的picture_order_count)通过图片显示编号来对RPU数据和BL/EL数据进行同步。The RPU layer coded bitstream or RPU data therein may be synchronized with the coded bitstreams of other layers, or BL data and EL data therein. For example, the RPU data and BL/EL data may be synchronized by picture display number according to the display order (e.g., as specified in H.264, picture_order_count).
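
A minimal sketch of this synchronization, assuming each decoded BL/EL picture and each RPU data unit has already been tagged with its display-order number. The dictionary layout and the key name `poc` are illustrative assumptions, not part of any specification.

```python
from typing import List, Optional, Tuple


def pair_by_display_order(pictures: List[dict],
                          rpu_units: List[dict]
                          ) -> List[Tuple[dict, Optional[dict]]]:
    """Pair each BL/EL picture with the RPU data unit that has the same
    display-order number, and return the pairs in display order.

    A picture without matching RPU data is paired with None.
    """
    rpu_by_poc = {unit["poc"]: unit for unit in rpu_units}
    ordered = sorted(pictures, key=lambda pic: pic["poc"])
    return [(pic, rpu_by_poc.get(pic["poc"])) for pic in ordered]


# Pictures arrive in decoding order, which may differ from display order.
pictures = [{"poc": 4}, {"poc": 0}, {"poc": 2}]
rpu_units = [{"poc": 0, "payload": b"a"}, {"poc": 2, "payload": b"b"}]
pairs = pair_by_display_order(pictures, rpu_units)
```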

本文所描述的技术支持用于以下项中的多种操作：层间预测、逆映射、色度重采样、数据处理如分区的边界区域中的插值、空间缩放、非线性量化等。一些所支持的操作可能对于一些或所有不同的VDR规范来说是共用的，而一些其他所支持的操作可能对于一个或更多个规范而不是对于所有VDR规范是唯一的。例如，可以对于具有残差值的VDR规范如EL数据执行非线性量化/去量化。The techniques described herein support a variety of operations for the following: inter-layer prediction, inverse mapping, chroma resampling, data processing such as interpolation in the boundary region of a partition, spatial scaling, non-linear quantization, etc. Some of the supported operations may be common to some or all of the different VDR specifications, while some other supported operations may be unique to one or more specifications rather than to all VDR specifications. For example, non-linear quantization/dequantization may be performed for VDR specifications with residual values, such as EL data.

本文所描述的技术支持由灵活的编码语法驱动的视频编码和视频解码。该方法例如使用改进的算法、实现成本、速度等使得编码器设计和解码器设计能够进行并行且持续的优化。利用当前RPU数据与先前发送的RPU数据之间的冗余，可以通过VDR编码器将编码语法连同分层VDR数据有效地发送和用信号通知给VDR解码器。编码语法为VDR解码器提供路线图(例如，完整的)来有效地执行例如逆数据流中的解码操作。The techniques described herein support video encoding and video decoding driven by a flexible coding syntax. This approach allows encoder designs and decoder designs to be optimized in parallel and continuously, for example with respect to improved algorithms, implementation cost, and speed. By exploiting the redundancy between current RPU data and previously sent RPU data, the VDR encoder can efficiently send and signal the coding syntax, together with the layered VDR data, to the VDR decoder. The coding syntax provides the VDR decoder with a (e.g., complete) roadmap for efficiently performing decoding operations, for example in an inverse data flow.

在一些实施方式中，本文所描述的机制形成媒体处理系统的一部分，媒体处理系统包括但不限于以下项中的任意项：手持设备、游戏机、电视机、笔记本电脑、上网本电脑、平板计算机、蜂窝无线电话、电子书阅读器、销售点终端、台式计算机、计算机工作站、计算机信息站或各种其他类型的终端和媒体处理单元。In some embodiments, the mechanisms described herein form part of a media processing system including, but not limited to, any of the following: a handheld device, a game console, a television, a laptop computer, a netbook computer, a tablet computer, a cellular wireless telephone, an electronic book reader, a point-of-sale terminal, a desktop computer, a computer workstation, a computer kiosk, or various other types of terminals and media processing units.

对于本技术领域的技术人员而言，对本文所描述的优选实施方式以及一般原理和特征的各种修改将是显而易见的。因此，本公开内容并不意在限于所示出的实施方式，而是与符合本文所描述的原理和特征的最宽范围一致。Various modifications to the preferred embodiments and to the general principles and features described herein will be readily apparent to those skilled in the art. Thus, the present disclosure is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features described herein.

2.VDR编码器2. VDR Encoder

VDR编码器可以使用符合一个或更多个不同VDR规范之一的编码语法来生成BL数据、EL数据和RPU数据。可以使用包括主要和/或次要版本号的不同组合的不同版本来标记或标识这些不同的VDR规范(可以类似地使用标识具体VDR规范的其他方式)。如本文所使用的，VDR规范可以提供可以包括在编码语法中的语法元素的规范，可以从上游设备如VDR编码器向下游设备如VDR解码器用信号通知该规范。A VDR encoder can generate BL data, EL data, and RPU data using a coding syntax that conforms to one of one or more different VDR specifications. These different VDR specifications can be marked or identified using different versions including different combinations of major and/or minor version numbers (other methods of identifying specific VDR specifications can be similarly used). As used herein, a VDR specification can provide specifications for syntax elements that can be included in the coding syntax, and the specification can be signaled from an upstream device, such as a VDR encoder, to a downstream device, such as a VDR decoder.

图1示出了基于符合一个或更多个VDR规范的编码语法来生成RPU数据的VDR编码器102。在一些实施方式中，VDR编码器102支持至少两个VDR规范，例如，分别标记有第一版本(“1.0”)或第二版本(“1.x”)。VDR编码器102可以被配置成根据符合VDR编码器102所支持的一个或更多个VDR规范的编码语法来对BL数据、EL数据、RPU数据、层间预测数据和中间媒体数据执行操作。这些不同的VDR规范可以包括但不限于：支持向后兼容性的第一版本和不支持向后兼容性的第二版本。如本文所使用的，术语“向后兼容性”指代BL数据是否包括被优化用于在SDR显示器上查看的SDR图像。可以使用一种或更多种计算设备来实现VDR编码器102。FIG. 1 illustrates a VDR encoder 102 that generates RPU data based on a coding syntax that conforms to one or more VDR specifications. In some embodiments, the VDR encoder 102 supports at least two VDR specifications, for example, labeled respectively as a first version (“1.0”) and a second version (“1.x”). The VDR encoder 102 can be configured to perform operations on BL data, EL data, RPU data, inter-layer prediction data, and intermediate media data according to a coding syntax that conforms to one or more VDR specifications supported by the VDR encoder 102. These different VDR specifications may include, but are not limited to, a first version that supports backward compatibility and a second version that does not support backward compatibility. As used herein, the term “backward compatibility” refers to whether the BL data includes an SDR image that is optimized for viewing on an SDR display. The VDR encoder 102 can be implemented using one or more computing devices.

在一种实施方式中，VDR编码器102被配置成接收(输入)VDR信号104并且从VDR信号104获得输入VDR图像。如本文所使用的，“输入VDR图像”可以包括用于对源图像的VDR版本进行解码的宽或高动态范围图像数据，源图像又可以是由高端图像采集设备捕获的原始图像。输入VDR图像可以是支持高动态范围色域的输入颜色空间中的高位深度(例如，10+位)图像。由本文所描述的VDR编码系统接收或处理的一个或更多个VDR信号的示例包括但不限于以下项中的任意项：12位P3 D65 RGB 444信号、12位推荐(Rec.)709 RGB 444信号、12位DCDM X'Y'Z'444信号、16位TIFF文件格式的视频数据等。In one embodiment, the VDR encoder 102 is configured to receive (input) a VDR signal 104 and obtain an input VDR image from the VDR signal 104. As used herein, an "input VDR image" may include wide or high dynamic range image data used to decode a VDR version of a source image, which may in turn be an original image captured by a high-end image acquisition device. The input VDR image may be a high bit depth (e.g., 10+ bits) image in an input color space that supports a high dynamic range color gamut. Examples of one or more VDR signals received or processed by the VDR encoding system described herein include, but are not limited to, any of the following: a 12-bit P3 D65 RGB 444 signal, a 12-bit Recommended (Rec.) 709 RGB 444 signal, a 12-bit DCDM X'Y'Z'444 signal, video data in a 16-bit TIFF file format, and the like.

在示例中，输入VDR图像中表示的每个像素包括针对颜色空间(例如，RGB颜色空间)定义的所有通道(例如，红色通道、绿色通道和蓝色通道)的像素值。每个像素可以可选地和/或可替代地包括颜色空间中的一个或更多个通道的上采样像素值或下采样像素值。应当注意的是，在一些实施方式中，除了三种基本颜色如红色、绿色和蓝色以外，例如，可以在本文所描述的颜色空间中同时使用不同的基本颜色，以支持宽色域；在这些实施方式中，本文所描述的图像数据包括这些不同基本颜色的另外的像素值，并且可以由本文所描述的技术同时来处理。In an example, each pixel represented in the input VDR image includes pixel values for all channels (e.g., red channel, green channel, and blue channel) defined for a color space (e.g., RGB color space). Each pixel may optionally and/or alternatively include upsampled pixel values or downsampled pixel values for one or more channels in the color space. It should be noted that in some embodiments, in addition to the three basic colors such as red, green, and blue, different basic colors may be used simultaneously in the color space described herein, for example, to support a wide color gamut; in these embodiments, the image data described herein includes additional pixel values for these different basic colors and may be processed simultaneously by the techniques described herein.

在一种实施方式中，VDR编码器102可以在映射颜色空间(YCbCr空间、RGB空间或其他颜色空间中之一)中执行层间预测相关操作。在一些实施方式中，如果输入颜色空间不同于映射颜色空间，则可以通过颜色空间转换单元将输入VDR图像从输入颜色空间转换至映射颜色空间。In one embodiment, the VDR encoder 102 may perform inter-layer prediction related operations in a mapped color space (YCbCr space, RGB space, or one of other color spaces). In some embodiments, if the input color space is different from the mapped color space, the input VDR image may be converted from the input color space to the mapped color space by a color space conversion unit.

在一种实施方式中，如图1所示，VDR编码器102可以被配置成接收(输入)SDR信号108并且从SDR信号108获得BL数据。由本文所描述的VDR编码系统接收的一个或更多个SDR信号的示例包括但不限于以下项中的任意项：8位YCbCr信号、8位YUV文件格式的视频数据等。In one embodiment, as shown in FIG. 1, the VDR encoder 102 may be configured to receive an (input) SDR signal 108 and obtain BL data from the SDR signal 108. Examples of the one or more SDR signals received by the VDR encoding system described herein include, but are not limited to, any of the following: an 8-bit YCbCr signal, video data in an 8-bit YUV file format, and the like.

如本文所使用的，“BL数据”指代可以被优化或不可以被优化以用于在SDR显示器上查看的低位深度(例如，8位)图像数据。如本文所使用的，术语“低位深度”指代在具有低位深度的编码空间中被量化的图像数据；低位深度的示例包括8位，而术语“高位深度”指代在具有高位深度的编码空间中被量化的图像数据；高位深度的示例是10位、12位或更多位。具体地，术语“低位深度”或“高位深度”不指代像素值的最低有效位或最高有效位。As used herein, "BL data" refers to low-bit-depth (e.g., 8-bit) image data that may or may not be optimized for viewing on an SDR display. As used herein, the term "low bit-depth" refers to image data quantized in a coding space with a low bit-depth; examples of low bit-depth include 8 bits, while the term "high bit-depth" refers to image data quantized in a coding space with a high bit-depth; examples of high bit-depth are 10 bits, 12 bits, or more. Specifically, the terms "low bit-depth" or "high bit-depth" do not refer to the least significant bit or the most significant bit of a pixel value.

在第一示例中，BL数据包括被优化用于在SDR显示器上查看的SDR图像，并且可以与支持向后兼容性的第一版本的VDR规范相关联。SDR图像可以包括由着色师进行的使SDR图像在相对窄的或标准的动态范围内看起来尽可能逼真的颜色校正。例如，为了在标准动态范围内创建看起来逼真的图像，可以在SDR图像中改变或校正与产生输入VDR图像的源HDR图像中的一些或所有像素有关的色调信息。In a first example, the BL data includes an SDR image optimized for viewing on an SDR display and may be associated with a first version of the VDR specification that supports backward compatibility. The SDR image may include color corrections performed by a colorist to make the SDR image appear as realistic as possible within a relatively narrow or standard dynamic range. For example, to create an image that appears realistic within a standard dynamic range, tonal information associated with some or all pixels in a source HDR image that produced the input VDR image may be changed or corrected in the SDR image.

在第二示例中，VDR编码器102被配置成将VDR-SDR(例如，色调)映射应用于输入VDR图像以获得BL数据，而不是从类似于图1的108的输入SDR信号获得BL数据。在该示例中，BL数据可以不被优化用于在SDR显示器上查看，并且可以与不支持向后兼容性的第二版本的VDR规范相关联。BL数据可以包括输入VDR图像的低位表示。例如，VDR-SDR映射可以基于以下项中的一项或更多项：全局量化、线性量化、线性拉伸、基于曲线的量化、概率密度函数(Pdf)优化量化、劳埃德-马克斯(Lloyd-Max)量化、基于分区的量化、感知量化、跨颜色通道/矢量量化或其他类型的量化。另外地、可选地、或替代地，例如，VDR-SDR映射可以包括以下处理中的零种或更多种处理：去噪处理、帧对准处理、颜色分级处理等。在该示例中，可以不优化BL数据以用于在标准动态范围内呈现看起来逼真的图像。相反地，BL数据可以要通过下游设备有效地与EL数据组合以构造与从图1的输入VDR信号获得的输入VDR图像对应的输出VDR图像。In a second example, the VDR encoder 102 is configured to apply a VDR-to-SDR (e.g., tone) mapping to an input VDR image to obtain the BL data, rather than obtaining the BL data from an input SDR signal such as 108 of FIG. 1. In this example, the BL data may not be optimized for viewing on an SDR display and may be associated with a second version of the VDR specification that does not support backward compatibility. The BL data may include a low bit depth representation of the input VDR image. For example, the VDR-to-SDR mapping may be based on one or more of the following: global quantization, linear quantization, linear stretching, curve-based quantization, probability density function (PDF) optimized quantization, Lloyd-Max quantization, partition-based quantization, perceptual quantization, cross-color-channel/vector quantization, or other types of quantization. Additionally, optionally, or alternatively, the VDR-to-SDR mapping may include zero or more of the following processes: denoising, frame alignment, color grading, etc. In this example, the BL data may not be optimized for rendering realistic-looking images within a standard dynamic range. Instead, the BL data is intended to be efficiently combined with the EL data by a downstream device to construct an output VDR image corresponding to the input VDR image obtained from the input VDR signal of FIG. 1.
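
As an illustration of one of the simpler options listed above, linear stretching, the following sketch maps high bit depth code values of one channel onto 8-bit BL code values. The function name and the per-image minimum/maximum stretch are assumptions made for illustration; the text does not prescribe a particular formula.

```python
from typing import List


def linear_stretch_to_8bit(vdr_codes: List[int]) -> List[int]:
    """Map high bit depth code values to 8-bit code values by stretching
    the observed [min, max] range of the input onto [0, 255]."""
    lo, hi = min(vdr_codes), max(vdr_codes)
    if hi == lo:
        # A flat image carries no range to stretch; map everything to 0.
        return [0 for _ in vdr_codes]
    # Round to nearest by adding 0.5 before truncation.
    return [int((v - lo) * 255 / (hi - lo) + 0.5) for v in vdr_codes]


# Three 12-bit code values (0..4095) mapped to 8-bit BL code values.
bl_codes = linear_stretch_to_8bit([0, 2048, 4095])
```

BL data produced this way is a compact low bit depth rendition rather than a graded SDR picture, which matches the second example, where the BL data is not optimized for viewing on an SDR display.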

在一种实施方式中，VDR编码器102或其中的基本层SDR编码器(116)被配置成将可以从SDR信号108或者从对从VDR信号104获得的输入VDR图像进行的映射操作获得的输入SDR图像编码成基本层比特流128。In one embodiment, the VDR encoder 102 or the base layer SDR encoder (116) therein is configured to encode an input SDR image that may be obtained from the SDR signal 108 or from a mapping operation performed on an input VDR image obtained from the VDR signal 104 into a base layer bitstream 128.

在一种实施方式中，VDR编码器102采用混合视频编码模型如H.264/MPEG-4 AVC(IS 14496-10)、HEVC、MPEG-4部分2(IS 14496-2)、MPEG-2(IS 13818-2)、VP8、VC-1和/或其他。可以根据同一图像中的相邻样本(使用帧内预测)或来自属于相同基本层的过去经解码的图像的样本(帧间预测)来对要在基本层中编码的媒体样本进行预测。这些要用于预测的经解码的BL样本可以存储或缓存在(基本层的)参考图片存储114内。In one embodiment, the VDR encoder 102 employs a hybrid video coding model such as H.264/MPEG-4 AVC (IS 14496-10), HEVC, MPEG-4 Part 2 (IS 14496-2), MPEG-2 (IS 13818-2), VP8, VC-1, and/or others. Media samples to be encoded in the base layer may be predicted from neighboring samples in the same image (intra prediction) or from samples of previously decoded images belonging to the same base layer (inter prediction). The decoded BL samples to be used for prediction may be stored or cached in the reference picture store 114 (of the base layer).

在一种实施方式中，VDR编码器102还被配置成基于经解码的BL样本对要在增强层中编码的媒体样本执行层间预测。可以从参考图片存储114(可以是一种或更多种存储器缓冲区或其他形式的存储器空间)检索经解码的BL样本。In one embodiment, the VDR encoder 102 is further configured to perform inter-layer prediction on media samples to be encoded in the enhancement layer based on the decoded BL samples. The decoded BL samples may be retrieved from a reference picture store 114 (which may be one or more memory buffers or other forms of memory space).

VDR编码器102或其中的VDR RPU 110可以被配置成生成用于由VDR编码器102执行的编码相关操作的编码语法。这些编码相关操作包括被执行来生成要在增强层比特流124中传送的EL数据的操作。根据本文所描述的技术，由VDR编码器将用于编码相关操作的编码语法用信号通知给VDR解码器，使得VDR解码器使用相同的编码语法用于解码相关操作。在一些实施方式中，由VDR编码器用信号通知给VDR解码器的编码语法可以指定由VDR解码器单独执行而非由VDR编码器执行的一种或更多种另外的操作。这些另外的操作包括但不限于显示管理操作中的任意操作。The VDR encoder 102 or the VDR RPU 110 therein can be configured to generate a coding syntax for the coding-related operations performed by the VDR encoder 102. These coding-related operations include operations performed to generate the EL data to be transmitted in the enhancement layer bitstream 124. According to the techniques described herein, the VDR encoder signals the coding syntax used for the coding-related operations to the VDR decoder, so that the VDR decoder uses the same coding syntax for its decoding-related operations. In some embodiments, the coding syntax signaled by the VDR encoder to the VDR decoder can specify one or more additional operations that are performed only by the VDR decoder and not by the VDR encoder. These additional operations include, but are not limited to, any of display management operations.

这些编码语法可以包括符合VDR编码器102所支持的具体VDR规范的多个语法元素，并且可以包括但不限于以下项中的任意项：层间预测系数、残差非线性去量化参数、色度重采样滤波器系数、颜色空间变换指示符或其他VDR语法元素(例如，由VDR编码器执行以生成BL数据和EL数据以及由VDR解码器执行以对BL数据和EL数据进行解码的函数和/或操作的标志和描述符)。These encoding syntaxes may include multiple syntax elements that conform to the specific VDR specification supported by the VDR encoder 102, and may include, but are not limited to, any of the following: inter-layer prediction coefficients, residual non-linear dequantization parameters, chroma resampling filter coefficients, color space transform indicators, or other VDR syntax elements (e.g., flags and descriptors of functions and/or operations performed by the VDR encoder to generate BL data and EL data and by the VDR decoder to decode the BL data and EL data).

在一种实施方式中，RPU处理模块112被配置成基于编码语法执行一系列序列级、帧级和分区级操作(可以包括但不仅限于与预测相关的那些操作)。例如，RPU处理模块112可以执行如编码语法所指定的操作例如，诸如SDR-VDR映射的逆映射、色度上采样、一种或更多种视频数据处理操作(例如，滤波、插值、重新缩放等)或非线性量化(NLQ)。因此，可以由RPU处理模块112生成预测参考值。In one embodiment, the RPU processing module 112 is configured to perform a series of sequence-level, frame-level, and partition-level operations (which may include, but are not limited to, those related to prediction) based on the coding syntax. For example, the RPU processing module 112 may perform operations specified by the coding syntax, such as inverse mapping (e.g., SDR-to-VDR mapping), chroma upsampling, one or more video data processing operations (e.g., filtering, interpolation, rescaling, etc.), or non-linear quantization (NLQ). As a result, prediction reference values may be generated by the RPU processing module 112.
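
The inverse mapping step can be sketched as a polynomial predictor, since the RPU data is described as carrying polynomial parameters for the prediction method. The polynomial order, the normalization to [0, 1], the clipping, and the name `predict_vdr` are illustrative assumptions rather than anything fixed by the text.

```python
from typing import Sequence


def predict_vdr(bl_code: int, coeffs: Sequence[float], bl_bits: int = 8) -> float:
    """Predict a normalized VDR sample value from a decoded BL code value.

    `coeffs` holds hypothetical inter-layer prediction coefficients
    (c0, c1, c2, ...) of the polynomial c0 + c1*s + c2*s**2 + ..., where s
    is the BL code value normalized to [0, 1].
    """
    s = bl_code / ((1 << bl_bits) - 1)
    acc = 0.0
    for c in reversed(coeffs):      # Horner's rule
        acc = acc * s + c
    return min(1.0, max(0.0, acc))  # clip to the normalized VDR range


# Hypothetical second-order coefficients signaled in the RPU data.
coeffs = (0.1, 0.5, 0.2)
darkest = predict_vdr(0, coeffs)
brightest = predict_vdr(255, coeffs)
```

Because the decoder evaluates the same polynomial with the same signaled coefficients, it can regenerate the prediction reference values without access to the original VDR image.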

在一种实施方式中，VDR编码器102可以执行一种或更多种操作来生成从输入VDR图像获得的(如果必要，从VDR信号104和经变换的颜色空间获得的)VDR图像数据与预测参考值之间的残差值(130)。在线性或对数域中残差值可能不同。在一种实施方式中，VDR编码器102或其中的残差下采样/重采样单元可以被配置成对残差值(130)执行一种或更多种下采样/重采样操作以生成经下采样的(例如，8位)残差值以用于进一步处理。在一种实施方式中，VDR编码器102或其中的残差非线性量化器(NLQ；118)可以被配置成对残差值(130)或经下采样的残差值执行一种或更多种非线性量化操作，并且将经非线性量化的残差值提供给VDR编码器102的其他单元以用于进一步处理。In one embodiment, the VDR encoder 102 may perform one or more operations to generate residual values (130) between the prediction reference values and the VDR image data obtained from the input VDR image (that is, from the VDR signal 104, color-space transformed if necessary). The residual values may be differences in a linear domain or in a logarithmic domain. In one embodiment, the VDR encoder 102 or a residual downsampling/resampling unit therein may be configured to perform one or more downsampling/resampling operations on the residual values (130) to generate downsampled (e.g., 8-bit) residual values for further processing. In one embodiment, the VDR encoder 102 or a residual non-linear quantizer (NLQ; 118) therein may be configured to perform one or more non-linear quantization operations on the residual values (130) or the downsampled residual values, and to provide the non-linearly quantized residual values to other units of the VDR encoder 102 for further processing.
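
The residual and NLQ steps can be sketched as follows. The text does not define the non-linear quantizer, so this example substitutes mu-law companding purely as a stand-in; the function names, the value of `mu`, and the 8-bit code range are assumptions.

```python
import math


def nlq_quantize(residual: float, mu: float = 8.0, r_max: float = 1.0) -> int:
    """Quantize a residual in [-r_max, r_max] to an 8-bit code (0..255).

    Stand-in quantizer: mu-law companding, which spends more codes on
    small residuals than on large ones.
    """
    r = max(-r_max, min(r_max, residual))
    y = math.copysign(math.log1p(mu * abs(r) / r_max) / math.log1p(mu), r)
    return int((y + 1.0) / 2.0 * 255 + 0.5)


def nlq_dequantize(code: int, mu: float = 8.0, r_max: float = 1.0) -> float:
    """Invert nlq_quantize up to quantization error."""
    y = code / 255 * 2.0 - 1.0
    return math.copysign(r_max * math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Residual between a VDR sample and its prediction, in the linear domain.
vdr_sample, predicted = 0.62, 0.61
code = nlq_quantize(vdr_sample - predicted)
reconstructed = predicted + nlq_dequantize(code)
```

A decoder would need the same quantizer parameters to run the inverse step, which is consistent with NLQ parameters being carried in the RPU data.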

在一种实施方式中，VDR编码器102或其中的增强层(仅出于说明的目的，8位/4:2:0)编码器(120)被配置成将残差值(在一些实施方式中可以被非线性量化和/或下采样)作为EL数据编码成增强层比特流124。In one embodiment, the VDR encoder 102 or the enhancement layer (8-bit/4:2:0 for illustration purposes only) encoder (120) therein is configured to encode the residual values (which may be non-linearly quantized and/or downsampled in some embodiments) as EL data into the enhancement layer bitstream 124.

在其中VDR编码器102或增强层(仅出于说明的目的，8位/4:2:0)编码器(120)采用混合视频编码器模型的实施方式中，可以根据同一图像中的相邻残差值样本(使用帧内预测)或根据来自属于相同增强层的过去经解码的图像中的残差值样本(帧间预测)来预测残差值。在一种实施方式中，用于预测的相同层EL样本可以被存储或缓存在(增强层的)参考图片存储122内。In embodiments where the VDR encoder 102 or the enhancement layer (8-bit/4:2:0 for illustration purposes only) encoder (120) employs a hybrid video encoder model, the residual value may be predicted from neighboring residual value samples in the same picture (using intra prediction) or from residual value samples from previously decoded pictures belonging to the same enhancement layer (inter prediction). In one embodiment, the same layer EL samples used for prediction may be stored or cached in the reference picture store (for the enhancement layer) 122.

在一种实施方式中，VDR编码器102或其中的VDR RPU 110被配置成将作为RPU数据的一部分的编码语法编码成VDR RPU比特流126。RPU数据可以包括但不限于以下项中的任意项：SDR-VDR映射参数、由应用于生成预测参考图像的预测方法所使用的多项式参数、NLQ参数、由VDR RPU(110)执行的一种或更多种视频数据处理操作所使用的参数。VDR RPU 110可以在RPU数据单元中设置标志或报头字段来指示是否可以根据先前发送的用于先前序列、先前帧或先前分区的RPU数据来预测编码语法中的任意语法元素。In one embodiment, the VDR encoder 102 or the VDR RPU 110 therein is configured to encode coding syntax as part of RPU data into the VDR RPU bitstream 126. The RPU data may include, but is not limited to, any of the following: SDR-VDR mapping parameters, polynomial parameters used by a prediction method applied to generate a predicted reference picture, NLQ parameters, parameters used by one or more video data processing operations performed by the VDR RPU (110). The VDR RPU 110 may set a flag or header field in an RPU data unit to indicate whether any syntax element in the coding syntax can be predicted based on previously sent RPU data for a previous sequence, previous frame, or previous partition.

可以使用多种编解码器如H.264/MPEG-4AVC、HEVC、MPEG-2、VP8、VC-1和/或其他中的一个或更多个来实现BL编码器(116)和EL编码器(120)中之一或两者。One or both of the BL encoder (116) and the EL encoder (120) may be implemented using one or more of a variety of codecs, such as H.264/MPEG-4 AVC, HEVC, MPEG-2, VP8, VC-1, and/or others.

相应的VDR解码器(其实现图1所示的逆数据流并且支持VDR编码器102用来生成编码语法的相同VDR规范)可以用于对由VDR编码器102生成的BL数据流、EL数据流和RPU数据流进行解码，并且生成输入VDR图像的重构版本。A corresponding VDR decoder (which implements the inverse data flow shown in FIG. 1 and supports the same VDR specification used by the VDR encoder 102 to generate the encoding syntax) can be used to decode the BL data stream, EL data stream, and RPU data stream generated by the VDR encoder 102 and generate a reconstructed version of the input VDR image.

图2示出了生成与一个或更多个不同的VDR规范对应的RPU数据的VDR编码器(202)。VDR编码器202可以但不限于与可能不同于由VDR编码器102实现的第一版本或第二版本的第三版本(例如，表示为“2.0”)的VDR规范相关联。可以使用一种或更多种计算设备实现VDR编码器202。FIG. 2 shows a VDR encoder (202) that generates RPU data corresponding to one or more different VDR specifications. The VDR encoder 202 can be, but is not limited to being, associated with a third version (e.g., denoted as "2.0") of the VDR specification that may be different from the first or second version implemented by the VDR encoder 102. The VDR encoder 202 can be implemented using one or more computing devices.

在示例实施方式中，VDR编码器202被配置成接收(输入)VDR信号204并且从VDR信号204获得输入VDR图像。输入VDR图像可以包括支持高动态范围色域的输入颜色空间中的高位深度(例如，10+位)图像数据。In an example embodiment, the VDR encoder 202 is configured to receive (input) a VDR signal 204 and obtain an input VDR image from the VDR signal 204. The input VDR image may include high bit depth (e.g., 10+ bits) image data in an input color space supporting a high dynamic range color gamut.

在一些实施方式中，如果输入颜色空间不同于VDR编码器202在其中执行预测操作的映射颜色空间，则可以通过颜色空间转换单元将输入VDR图像从输入颜色空间变换至映射颜色空间。In some embodiments, if the input color space is different from the mapped color space in which the VDR encoder 202 performs the prediction operation, the input VDR image may be transformed from the input color space to the mapped color space by a color space conversion unit.

在一种实施方式中，如图2所示，VDR编码器202被配置成接收(输入)SDR信号208并且从SDR信号208获得BL数据。或者，VDR编码器202被配置成将VDR-SDR(例如，色调)映射应用于输入VDR图像来获得BL数据，而不是从SDR信号解码BL数据。BL数据可以包括可以被优化或不被优化用于在SDR显示器上查看的低位深度(例如，8位)图像数据。与VDR编码器202相关联的BL数据可以类似于或不类似于如上面所讨论的与VDR编码器102相关联的BL数据。In one embodiment, as shown in FIG. 2, the VDR encoder 202 is configured to receive an (input) SDR signal 208 and obtain BL data from the SDR signal 208. Alternatively, the VDR encoder 202 is configured to apply a VDR-to-SDR (e.g., tone) mapping to the input VDR image to obtain the BL data, rather than decoding the BL data from the SDR signal. The BL data may include low bit depth (e.g., 8-bit) image data that may or may not be optimized for viewing on an SDR display. The BL data associated with the VDR encoder 202 may or may not be similar to the BL data associated with the VDR encoder 102 as discussed above.

在一种实施方式中，VDR编码器202或其中的基本层SDR编码器(216)被配置成将BL数据编码成基本层比特流228，其中该BL数据可以从SDR信号208获得或者根据对从VDR信号204获得的输入VDR图像进行的映射操作获得。In one embodiment, the VDR encoder 202 or the base layer SDR encoder (216) therein is configured to encode BL data into a base layer bitstream 228, wherein the BL data may be obtained from the SDR signal 208 or obtained based on a mapping operation performed on an input VDR image obtained from the VDR signal 204.

在一种实施方式中，可以根据同一图像中的相邻样本(使用帧内预测)或根据来自属于相同的基本层的过去经解码的图像的样本(帧间预测)来预测由BL数据表示的媒体样本。这些样本可以存储或缓存在(基本层的)参考图片存储214内。In one embodiment, the media samples represented by the BL data may be predicted from neighboring samples in the same picture (using intra prediction) or from samples from previously decoded pictures belonging to the same base layer (inter prediction). These samples may be stored or cached in the reference picture store (base layer) 214.

在一种实施方式中，VDR编码器202被配置成基于BL数据样本对与增强层相关的高位深度媒体样本执行层间预测。可以从一种或更多种存储器缓冲区或其他形式的存储器空间中的参考图片存储214中检索BL数据样本。在一些实施方式中，VDR编码器202被设计成基于使用层间参考图片和经解码的BL样本的准确预测算法执行一种或更多种操作来不生成高位深度媒体样本的残差值(或，即使生成，残差值也都为零)。因此，可以至少部分地基于层间参考图片和经解码的BL样本来准确预测高位深度媒体样本。In one embodiment, the VDR encoder 202 is configured to perform inter-layer prediction on high bit depth media samples associated with the enhancement layer based on the BL data samples. The BL data samples may be retrieved from a reference picture store 214 held in one or more memory buffers or other forms of memory space. In some embodiments, the VDR encoder 202 is designed to perform one or more operations based on an accurate prediction algorithm that uses inter-layer reference pictures and decoded BL samples, so that no residual values are generated for the high bit depth media samples (or, if residual values are generated, they are all zero). Thus, the high bit depth media samples can be accurately predicted based at least in part on the inter-layer reference pictures and the decoded BL samples.

VDR编码器202或其中的VDR RPU 210可以被配置成生成用于生成要在增强层比特流224中传送的EL数据(可以包括层间参考图片)的编码语法。编码语法可以包括符合由VDR编码器202支持的VDR规范的多个语法元素，并且可以包括但不限于下列项中任意项：层间预测系数、色度重采样滤波器系数、颜色空间变换指标、其他VDR语法元素(例如，由VDR编码器执行以生成BL数据和EL数据等的函数和/或操作的标志和描述符)等。The VDR encoder 202 or the VDR RPU 210 therein may be configured to generate coding syntax for generating EL data (which may include inter-layer reference pictures) to be transmitted in the enhancement layer bitstream 224. The coding syntax may include a plurality of syntax elements that conform to the VDR specification supported by the VDR encoder 202, and may include, but is not limited to, any of the following: inter-layer prediction coefficients, chroma resampling filter coefficients, color space transform indicators, other VDR syntax elements (e.g., flags and descriptors of functions and/or operations performed by the VDR encoder to generate BL data, EL data, etc.), etc.

在一种实施方式中，RPU处理模块212被配置成基于编码语法执行一系列操作(可以包括但不仅限于与预测相关的那些操作)。例如，RPU处理模块212可以执行如编码语法所指定的操作，例如，诸如SDR-VDR映射的逆色调映射、色度上采样以及一种或更多种视频数据处理操作(例如，滤波、插值、重新缩放等)。在一种实施方式中，RPU处理模块212不生成残差值(或者由于由VDR编码器202或其中的RPU处理模块212执行的准确预测操作而使得所有的残差值为零)。在该实施方式中，由于增强层VDR编码器220处理像素数据，所以为了生成EL数据，RPU处理模块212不执行残差非线性量化(NLQ)。因此，由VDR RPU 210生成的编码语法可以不具有与残差非线性量化(NLQ)相关的参数。In one embodiment, the RPU processing module 212 is configured to perform a series of operations (which may include but are not limited to those related to prediction) based on the coding syntax. For example, the RPU processing module 212 may perform operations as specified by the coding syntax, such as inverse tone mapping such as SDR-VDR mapping, chroma upsampling, and one or more video data processing operations (e.g., filtering, interpolation, rescaling, etc.). In one embodiment, the RPU processing module 212 does not generate residual values (or all residual values are zero due to accurate prediction operations performed by the VDR encoder 202 or the RPU processing module 212 therein). In this embodiment, since the enhancement layer VDR encoder 220 processes pixel data, the RPU processing module 212 does not perform residual nonlinear quantization (NLQ) to generate EL data. Therefore, the coding syntax generated by the VDR RPU 210 may not have parameters related to residual nonlinear quantization (NLQ).

在一种实施方式中，RPU处理模块212被配置成基于编码语法生成层间参考图片。不需要对于每个输入VDR图像生成本文所描述的层间参考图片；可以对于从VDR信号204获得的一个或更多个连续的VDR图像的序列生成层间参考图片。来自层间参考图片的媒体样本可以被存储或缓存在(增强层的)参考图片存储222内。In one embodiment, the RPU processing module 212 is configured to generate inter-layer reference pictures based on the coding syntax. The inter-layer reference pictures described herein need not be generated for each input VDR image; an inter-layer reference picture may be generated for a sequence of one or more consecutive VDR images obtained from the VDR signal 204. Media samples from the inter-layer reference pictures may be stored or cached in the reference picture store 222 (for the enhancement layer).

在一种实施方式中，VDR编码器202或其中的增强层编码器(220)被配置成至少部分基于从VDR信号204获得的层间参考图片和/或输入VDR图像来将输出EL信号编码成增强层比特流224。In one embodiment, the VDR encoder 202 or the enhancement layer encoder (220) therein is configured to encode an output EL signal into the enhancement layer bitstream 224 based at least in part on the inter-layer reference pictures and/or the input VDR images obtained from the VDR signal 204.

在一种实施方式中，VDR编码器202或其中的VDR RPU(210)被配置成将作为RPU数据的至少一部分的编码语法编码成VDR RPU比特流226。In one embodiment, the VDR encoder 202 or the VDR RPU ( 210 ) therein is configured to encode the coding syntax as at least a portion of the RPU data into the VDR RPU bitstream 226 .

可以使用多种编解码器如H.264/MPEG-4AVC、HEVC、MPEG-2、VP8、VC-1和/或其他中的一个或更多个来实现基本层编码器(216)和增强层编码器(220)中之一或两者。One or both of the base layer encoder (216) and the enhancement layer encoder (220) may be implemented using one or more of a variety of codecs such as H.264/MPEG-4 AVC, HEVC, MPEG-2, VP8, VC-1, and/or others.

相应的VDR解码器(其实现图2所示的逆数据流并且支持VDR编码器202用于生成编码语法的相同的VDR规范)可以用于对由第三版本的VDR编码器202生成的比特流进行解码，并且生成输入VDR图像的重构版本。A corresponding VDR decoder (which implements the inverse data flow shown in FIG. 2 and supports the same VDR specification used by the VDR encoder 202 to generate the encoding syntax) can be used to decode the bitstream generated by the third version of the VDR encoder 202 and generate a reconstructed version of the input VDR image.

Additionally, optionally, or alternatively, the techniques described herein may support VDR codecs (or coding systems) corresponding to other versions of the VDR specification.

3. RPU data unit

In some embodiments, RPU data generated by an upstream device, such as a VDR encoder (e.g., 102 of FIG. 1 or 202 of FIG. 2), may be provided to a downstream device in a plurality of network abstraction layer (NAL) data units. In one embodiment, as shown in FIG. 3, a NAL data unit includes a NAL header and a raw byte sequence payload (RBSP). For illustrative purposes only, when the RBSP in the NAL data unit is used to encapsulate RPU data, the field "NAL-unit-type" in the NAL header may be set to 25 or to another identification number different from the NAL types specified in the H.264/MPEG-4 AVC specification (IS 14496-10).
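The encapsulation just described can be illustrated with a minimal Python sketch. It assumes the one-byte NAL header layout of H.264/MPEG-4 AVC (a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc, and a 5-bit nal_unit_type); the function names and the constant are hypothetical, the value 25 is only the illustrative value given above, and removal of emulation-prevention bytes from the RBSP is omitted.

```python
# Sketch only: assumes the one-byte H.264/AVC NAL header layout.
RPU_NAL_UNIT_TYPE = 25  # illustrative value from the text, not normative


def split_nal_unit(nal: bytes):
    """Return (nal_ref_idc, nal_unit_type, payload_bytes) for one NAL unit."""
    if not nal:
        raise ValueError("empty NAL unit")
    header = nal[0]
    if header & 0x80:
        raise ValueError("forbidden_zero_bit is set")
    nal_ref_idc = (header >> 5) & 0x03   # 2 bits
    nal_unit_type = header & 0x1F        # 5 bits
    return nal_ref_idc, nal_unit_type, nal[1:]


def is_rpu_nal(nal: bytes) -> bool:
    """True if the NAL unit type marks the payload as RPU data."""
    return split_nal_unit(nal)[1] == RPU_NAL_UNIT_TYPE
```

A downstream device could use a check such as `is_rpu_nal()` to route RPU-carrying NAL units to an RPU parser and hand all other NAL units to the video decoder.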

In some embodiments, the RPU data in the RBSP of a NAL data unit includes an RPU data header and an RPU data payload. The RPU data unit can serve as a common carrier for delivering RPU data from an upstream device to a downstream device, where the RPU data can be associated with any of a plurality of VDR specifications (e.g., different versions). The RPU data header may include header fields that identify the type of codec or coding system (e.g., a 3D coding system or a VDR coding system) and a specific VDR specification among a plurality of different VDR specifications. The RPU data header may also include one or more high-level (e.g., sequence-level or frame-level) portions of the RPU data carried in the RPU data unit.

The RPU data payload can be used by an upstream device to send, to a downstream device, descriptors (or syntactic descriptions) of a set of flags, operations, and parameters that can be used to decode a multi-layer video signal and to reconstruct VDR images from the decoded video signal. One or more of the flags, operations, and parameters described by the RPU data payload for reconstructing VDR images can relate to inter-layer prediction. The flags, operations, and parameters used for inter-layer prediction as described herein can relate to one or more of inverse mapping, chroma upsampling, and other functions such as display management. Additionally, optionally, or alternatively, one or more of the flags, operations, and parameters described by the RPU data payload for reconstructing VDR images can relate to data processing that is ancillary to, or even separate from, inter-layer prediction.

FIG. 4 illustrates the layout of an RPU data header in an example embodiment. In one embodiment, the RPU data header includes a plurality of header fields. For illustrative purposes only, the header fields may include, but are not limited to, any of the following: "rpu_type", "rpu_format", "vdr_rpu_profile", "vdr_rpu_level", "vdr sequence level information", "vdr frame level information", and the like.

The header field "rpu_type" can be used to identify whether the RPU data relates to a 3D codec (e.g., when rpu_type = 0 or 1) or to a VDR codec (e.g., when rpu_type = 2). The header field "rpu_type" can also accommodate additional new video codecs yet to be developed. The header field "rpu_format" can be used to identify the one or more VDR versions to which the RPU data relates. For illustrative purposes only, the most significant bits of the header field "rpu_format" can be used to distinguish major differences between VDR codecs, while the least significant bits of the same field can be used to distinguish minor variations of a VDR codec. For example, when the most significant bits (e.g., the top 3 bits) of the header field "rpu_format" are 0, the RPU data relates to the VDR version 1.x process flow; on the other hand, when the most significant bits (e.g., the top 3 bits) of the header field "rpu_format" are 1, the RPU data relates to the VDR version 2.0 process flow.
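As a rough illustration of how a decoder might interpret these two header fields, consider the following Python sketch. The total width of "rpu_format" is not stated in the text, so the 11-bit width used here is purely an assumption, as are the function name and the returned strings; only the value assignments for "rpu_type" and the use of the top 3 bits come from the description above.

```python
# Sketch only: the field width below is a hypothetical choice for illustration.
RPU_FORMAT_BITS = 11   # assumed width of rpu_format (not given in the text)
MAJOR_BITS = 3         # "top 3" bits carry the major version, per the example


def describe_rpu(rpu_type: int, rpu_format: int) -> str:
    """Classify RPU data from the rpu_type and rpu_format header fields."""
    if rpu_type in (0, 1):
        return "3D codec"
    if rpu_type != 2:
        return "reserved or unknown codec"
    if not 0 <= rpu_format < (1 << RPU_FORMAT_BITS):
        raise ValueError("rpu_format out of range")
    minor_bits = RPU_FORMAT_BITS - MAJOR_BITS
    major = rpu_format >> minor_bits               # most significant bits
    minor = rpu_format & ((1 << minor_bits) - 1)   # least significant bits
    if major == 0:
        return f"VDR v1.x (minor variant {minor})"
    if major == 1:
        return f"VDR v2.0 (minor variant {minor})"
    return "VDR, unrecognized major version"
```

Keeping the major and minor parts in one field lets an older decoder reject an unfamiliar major version while still tolerating minor variations it does not distinguish.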

The VDR coding system described herein may support one or more different RPU profiles. The header field "vdr_rpu_profile" can be used to identify the profile to which the RPU data relates. For example, a value of 0 indicates a baseline profile that specifies the mapping color space YCbCr, the mapping chroma format 4:2:0, the polynomial mapping method, and a global-only mapping partition; a value of 1 indicates a main profile that specifies all mapping color spaces, all mapping chroma formats, all mapping methods, and a global-only mapping partition; and a value of 2 indicates a high profile that specifies all mapping color spaces, all mapping chroma formats, all mapping methods, and locally adaptive mapping partitions (either global or local partitions). In some embodiments, other possible values of the header field "rpu_profile" may be reserved for use by new profiles that are being developed or are yet to be developed. The header field "rpu_level" may additionally and/or optionally be used to further distinguish levels of complexity of the RPU processing performed using the RPU data.
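The three example values of "vdr_rpu_profile" can be captured in a small lookup table. The Python sketch below is a hedged illustration: the table layout, the `profile_allows()` helper, and the string labels for color spaces, chroma formats, and mapping methods are all hypothetical, and only the three value assignments follow the example in the text.

```python
# Sketch only: names and string labels are illustrative assumptions.
ALL = None  # sentinel meaning "no restriction"

# value -> (name, color spaces, chroma formats, mapping methods, local partitions allowed)
RPU_PROFILES = {
    0: ("baseline", {"YCbCr"}, {"4:2:0"}, {"polynomial"}, False),
    1: ("main", ALL, ALL, ALL, False),
    2: ("high", ALL, ALL, ALL, True),
}


def profile_allows(vdr_rpu_profile, color_space, chroma_format, method, num_partitions):
    """Check whether a mapping configuration fits within the signaled profile."""
    if vdr_rpu_profile not in RPU_PROFILES:
        raise ValueError("reserved or unknown vdr_rpu_profile value")
    _, spaces, formats, methods, local_ok = RPU_PROFILES[vdr_rpu_profile]
    if spaces is not ALL and color_space not in spaces:
        return False
    if formats is not ALL and chroma_format not in formats:
        return False
    if methods is not ALL and method not in methods:
        return False
    if num_partitions > 1 and not local_ok:
        return False  # only a single global partition is permitted
    return True
```

A decoder that implements only the baseline configuration could use such a check to reject a bitstream early instead of failing partway through parsing.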

According to the techniques described herein, a coding syntax comprising one or more syntax elements that conform to a VDR specification may be sent or signaled by a VDR encoder to a VDR decoder in an RPU bitstream. The syntax elements may specify flags, operations, and parameters used in VDR encoding operations and in the corresponding VDR decoding operations. The parameters represented in the syntax elements may have different coefficient types, and may be specified as logical values, integer (fixed-point) values, or floating-point values with different precisions, lengths, word sizes, and so on.

Some syntax elements in the coding syntax can be classified as sequence-level information, which remains unchanged for a complete sequence of consecutive images. It should be noted that the same syntax element may be used at the sequence level or at a different level in different coding syntaxes. Examples of sequence-level information include, but are not limited to, any of the following syntax elements: "chroma_sample_loc_type", "vdr_color_primaries", "vdr_chroma_format_idc", and the like. As shown in FIG. 4, the sequence-level information is placed in the header field "vdr sequence level information", which may be a composite field that in turn includes a flag "vdr_seq_info_present_flag" to indicate whether any specific sequence-level information is encoded directly in the one or more current RPU data units or is instead to be predicted from previous RPU data.

In some embodiments, for reasons of transmission efficiency, the VDR encoder need not send sequence-level information to the VDR decoder for each image (which, in the context of this description, may be referred to interchangeably as a frame). Instead, the sequence-level parameters may be sent once per sequence of consecutive frames. However, for reasons of random access, error correction, and robustness, embodiments of the present invention do not exclude repeating the sequence-level parameters once, twice, or more within the same sequence. In one example, in a sequence comprising 100 consecutive images, the sequence-level parameters may be repeated within the sequence after intervals of 10 frames, 25 frames, 50 frames, and so on. In another example, the sequence-level parameters may be repeated within the sequence at every instantaneous decoding refresh (IDR) picture, at every second IDR picture, and so on.

Some syntax elements in the coding syntax can be classified as frame-level information, which remains unchanged for a complete frame. In some embodiments, as shown in FIG. 4, the frame-level information is placed in the header field "vdr frame level information". In some embodiments, some or all of the frame-level information can be predicted from the frame-level syntax elements sent in a previous RPU data unit.

For example, inter-layer prediction coefficients may be the same or similar across a group of pictures (GOP), a scene, a sequence of frames, and so on. It is therefore unnecessary to repeat the frame-level information for every frame. For one or more current RPU data units having the same RPU identifier (ID), the VDR encoder may signal the RPU ID to the VDR decoder in the RPU data field "vdr_rpu_id" of the RPU data unit, to indicate that the frame-level information is encoded directly in the current RPU data unit (so that the frame-level information can be retrieved directly, without reference to previous RPU data identified by a different RPU ID).

In some embodiments, a flag "use_prev_vdr_rpu_flag" in one or more current RPU data units can be set to indicate to the VDR decoder that the frame-level information in one or more previously sent RPU data units should be reused, or used to predict the frame-level information relating to the one or more current RPU data units. The previously sent RPU data units can be identified in the RPU data field "prev_vdr_rpu_id" in the one or more current RPU data units. Sending predictable frame-level syntax elements in the one or more current RPU data units can thus be avoided. In some embodiments, since the current RPU data unit does not directly carry encoded frame-level syntax elements, assigning an RPU ID to the current RPU data unit can also be avoided. The maximum number of RPU IDs, together with their corresponding frame-level syntax elements, that can be used for prediction may depend on a cost-benefit trade-off between the reduction in the amount of data in bitstream transmission and the increase in memory usage at the VDR decoder.

In some embodiments, the techniques described herein support dividing an image into one or more partitions. Some of the syntax elements that can be used to specify the coding syntax may be classified as frame-level syntax elements, while other syntax elements may be classified as partition-level syntax elements.

FIG. 5 illustrates an RPU decoding (or parsing) process that can be used to decode (or parse) sequence-level and/or frame-level syntax elements from RPU data units. The RPU decoding/parsing process may be configured to receive one or more current RPU data units and to obtain at least some of the syntax elements from one or more RPU data headers of the one or more current RPU data units. First, the RPU decoding/parsing process may determine whether the flag "use_prev_vdr_rpu_flag" is present and, if so, what the value of the flag (a syntax element) is.

If the flag is determined to be set to 1 (or "yes"), the RPU decoding/parsing process proceeds to decode or parse, from the received RPU data units, the syntax element "prev_vdr_rpu_id", which points to a predictor (or previous) RPU ID associated with previously sent syntax elements in one or more previously sent RPU data units. The RPU decoding/parsing process can then use the predictor RPU ID as a key to retrieve the previously sent syntax elements from a designated RPU data cache. This processing path, in which "use_prev_vdr_rpu_flag" is set to 1 (or "yes"), can be used to predict some or all of the sequence-level, frame-level, and partition-level syntax elements of the coding syntax from previously sent syntax elements of the same level held in the RPU data cache.

On the other hand, if the flag "use_prev_vdr_rpu_flag" is determined to be 0 (or "no"), the RPU decoding/parsing process proceeds to decode or parse, from the received RPU data units, the syntax element "vdr_rpu_id", which is set to the current RPU ID assigned to the syntax elements that are encoded directly in the one or more current RPU data units. For illustrative purposes only, these syntax elements may include, but are not limited to: "mapping_color_space", "mapping_chroma_idc", "chroma_resampling_filter_idc", "num_pivots_minus2", "pred_pivot_value[][]", "nlq_method_idx", "nlq_num_pivots_minus2", "nlq_pred_pivot_value[][]", "enable_residual_spatial_upsampling_flag", "num_x_partition_minus1", "num_y_partition_minus1", "residual_resampling_filter_idc", "overlapped_prediction_method", and the like. It should be noted that, in various embodiments, different syntax elements and/or different names for the syntax elements may be defined or used to implement the techniques described herein.
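The two processing paths described above (reuse of cached syntax elements versus direct decoding) can be sketched as follows. This is a simplified Python illustration, not the parsing process of FIG. 5: the RPU data unit is modeled as an already-parsed dictionary, and the helper name, the "syntax_elements" key, and the cache structure are hypothetical.

```python
# Sketch only: RPU data units are modeled as pre-parsed dicts (an assumption).
def resolve_frame_syntax(rpu_unit: dict, cache: dict) -> dict:
    """Return the syntax elements that apply to the current RPU data unit.

    cache maps an RPU ID to the syntax elements that were sent under that ID.
    """
    if rpu_unit.get("use_prev_vdr_rpu_flag", 0) == 1:
        # Prediction path: look up previously sent elements by predictor RPU ID.
        prev_id = rpu_unit["prev_vdr_rpu_id"]
        if prev_id not in cache:
            raise KeyError(f"no cached RPU data for RPU ID {prev_id}")
        return cache[prev_id]
    # Direct path: elements are carried in this unit; cache them under vdr_rpu_id.
    rpu_id = rpu_unit["vdr_rpu_id"]
    elements = dict(rpu_unit["syntax_elements"])
    cache[rpu_id] = elements
    return elements
```

For example, a unit that carries `vdr_rpu_id = 3` populates the cache, and a later unit with `use_prev_vdr_rpu_flag = 1` and `prev_vdr_rpu_id = 3` resolves to the same elements without retransmitting them. Bounding the number of cached IDs reflects the memory versus bit-rate trade-off noted earlier.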

One or more of these syntax elements may be flags that indicate the presence or absence of certain corresponding operations. For example, the flag "use_prev_vdr_rpu_flag" indicates the presence or absence of an operation that predicts RPU data from the cached syntax elements of a previous RPU ID. Similarly, the flag "enable_residual_spatial_upsampling_flag" may indicate whether a residual resampling filtering operation should be performed when a VDR image is reconstructed from the received BL data and EL data. The indicator "chroma_resampling_filter_idc" may indicate which chroma resampling filter should be used when a VDR image is reconstructed from the received BL data and EL data. Within the RPU decoding/parsing process itself, each of these flags can also be used to determine whether a particular processing path should be taken.

RPU type and version information (which may indicate, for example, whether the RPU data is associated with the v1.x process flow of the corresponding VDR specification and/or whether the VDR specification is implemented without residual EL data) can be used to determine some of the processing paths in the RPU decoding/parsing process shown in FIG. 5.

Parameters such as the syntax elements "num_x_partition_minus1" and "num_y_partition_minus1" can also be used to determine some of the processing paths in the RPU decoding/parsing process. For example, if both syntax elements have a value of zero, which indicates a single global partition, the processing path corresponding to the single global partition can be taken. On the other hand, as shown in FIG. 5, if either or both of these syntax elements have a non-zero value, a different processing path can be taken.

5. RPU data decoding - partition level

In some embodiments, partition-level syntax elements may be sent by the VDR encoder to the VDR decoder in one or more current RPU data units (e.g., in one or more RPU payloads). FIG. 6 illustrates an RPU data payload decoding/parsing process, in an example embodiment, that can be used to decode partition-level syntax elements from the one or more current RPU data units. One or more of the partition-level syntax elements may relate to the coding syntax, or may be used in the coding syntax to specify operations related to inter-layer prediction and/or other processing operations.

In some embodiments, the RPU data payload decoding/parsing process of FIG. 6 may be implemented as a function "vdr_rpu_data_payload()", which may be called, for example, by the RPU decoding/parsing process of FIG. 5.

In some embodiments, several steps of FIG. 6 are repeated for each partition, iterating along the x-direction and the y-direction of the image frame. As shown in FIG. 6, a function "rpu_data_mapping(x, y)" may first be called for each partition to decode the partition-level syntax elements that are common to a plurality of different VDR specifications. Subsequently, syntax elements that are more specific to a particular VDR specification may be decoded. The more specific syntax elements may be decoded based on other syntax elements or, for example, on RPU information that has already been decoded from the one or more current RPU data units. For example, based on (1) the "rpu_format" field and (2) the syntax element "mapping_chroma_idc" (for a VDR specification of version v1.x) or the syntax elements "vdr_chroma_format_idc" and "sdr_chroma_format_idc" (for a VDR specification of version v2.x), as decoded from one or more RPU data headers of the one or more current RPU data units, the RPU data payload decoding/parsing process of FIG. 6 can determine whether a chroma resampling operation should be performed on the received BL data and EL data that conform to the VDR specification of version "v1.x" or "v2.x".

If the VDR specification is determined to be version v1.x and the flag "disable_residual_flag" is false, the RPU decoding/parsing function "rpu_data_nlq(x, y)" may be called for each partition. In addition, as shown in FIG. 6, when both the x partition index and the y partition index are zero, other decoding/parsing functions, such as the frame-level RPU decoding/parsing function "rpu_data_residual_resampling(x, y)", may also be called.
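The per-partition control flow described above can be sketched by listing the order in which the parsing functions would be invoked. The Python sketch below makes two assumptions that the text does not settle: partitions are visited row by row (y outer, x inner), and the residual resampling call is nested under the same v1.x/residual condition as "rpu_data_nlq". The function `payload_parse_order()` and its `is_v1x` argument are hypothetical.

```python
# Sketch only: iteration order and nesting of the residual call are assumptions.
def payload_parse_order(num_x_partition_minus1, num_y_partition_minus1,
                        is_v1x, disable_residual_flag):
    """List the (function name, x, y) calls made while parsing an RPU payload."""
    calls = []
    for y in range(num_y_partition_minus1 + 1):
        for x in range(num_x_partition_minus1 + 1):
            # Elements common to the supported VDR specifications.
            calls.append(("rpu_data_mapping", x, y))
            if is_v1x and not disable_residual_flag:
                # Non-linear quantization parameters for the residual.
                calls.append(("rpu_data_nlq", x, y))
                if x == 0 and y == 0:
                    # Frame-level data is parsed once, with the first partition.
                    calls.append(("rpu_data_residual_resampling", x, y))
    return calls
```

With `num_x_partition_minus1 = num_y_partition_minus1 = 0` the loop body runs exactly once, which corresponds to the single global partition case.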

In some embodiments, the VDR specification implemented by the VDR coding system supports chroma resampling, inverse mapping, and prediction-based operations, the latter including but not limited to: overlapped-region-based prediction, residual non-linear quantization/de-quantization, residual chroma resampling, spatial scaling, data processing (e.g., interpolation in the boundary regions of partitions), and the like.

For chroma resampling, the VDR specification may support fixed filters as well as explicit 1D (2D separable) filters and 2D (non-separable) filters, or filters that use information from other luma or chroma channels (cross-channel resampling filters), and so on. The syntax element "chroma_resampling_filter_idc" can be used to specify, as part of the coding syntax, which of these filters is used. According to the techniques described herein, different chroma channels may use different filters. Additionally, optionally, or alternatively, an explicit filter may be symmetric or asymmetric. In some embodiments, one or more of the filters described herein may be designed to treat picture boundaries (image boundaries) as a special case in the filtering operation. For example, a filter may simply pad the picture boundary by repetition (as shown in FIG. 6) or by mirroring. In some embodiments, one or more of the filters described herein are designed to operate across different partitions, or to treat partition boundaries in the same way as picture boundaries. Different chroma resampling filters may be used for different partitions of the same image. Additionally, optionally, or alternatively, one filter may be applied to all partitions of a complete image. Additionally, optionally, or alternatively, a particular type of filter, such as an explicit filter, may be specified for the complete image, while different coefficients are specified in the coding syntax for different partitions. For example, the syntax element "chroma_resampling_filter_idc" may be signaled at the frame level to indicate that a particular type of filter is used for the entire frame, while different filter coefficients for different partitions within the frame are signaled at the partition level. Additionally, optionally, or alternatively, partition-level filter coefficients may be encoded directly, predicted from one or more previous partitions under the current RPU ID, or predicted from one or more partitions in one or more RPU data units under a previous RPU ID. The coefficients described herein may be encoded non-differentially or differentially (e.g., as difference values relative to the values for a different partition, image, or chroma channel). The chroma resampling filter should also take the chroma sample location into account.
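The two boundary treatments mentioned above, padding by repetition and padding by mirroring, can be compared with a small one-dimensional Python sketch. It is not a chroma resampler: it applies an odd-length filter at the original sample positions only, and the function names and the choice of mirroring without duplicating the edge sample are assumptions made for illustration.

```python
# Sketch only: 1D boundary handling for an odd-length filter kernel.
def pad_1d(samples, count, mode="repeat"):
    """Extend a row of samples by `count` values on each side."""
    n = len(samples)
    if mode == "repeat":
        left = [samples[0]] * count
        right = [samples[-1]] * count
    elif mode == "mirror":
        # Reflect about the edge sample without duplicating it.
        left = [samples[min(i, n - 1)] for i in range(count, 0, -1)]
        right = [samples[max(n - 1 - i, 0)] for i in range(1, count + 1)]
    else:
        raise ValueError("mode must be 'repeat' or 'mirror'")
    return left + list(samples) + right


def filter_1d(samples, coeffs, mode="repeat"):
    """Apply an odd-length filter, padding the boundaries as requested."""
    half = len(coeffs) // 2
    padded = pad_1d(samples, half, mode)
    return [sum(c * padded[i + k] for k, c in enumerate(coeffs))
            for i in range(len(samples))]
```

The same padding routine could be applied at partition boundaries when a filter is designed to treat them like picture boundaries, whereas a filter that operates across partitions would read the neighboring partition's samples instead.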

Inverse mapping can play an important role in a VDR layered codec. The VDR specification described herein may support a variety of inverse mapping methods. Examples of inverse mapping methods include, but are not limited to, any of the following: bit shift, polynomial, MMR, SOP, 1D LUT, curve fitting, and the like. The decoding/parsing function "rpu_data_mapping()" shown in FIG. 8 can be used to decode (or parse) the syntax elements relating to the inverse mapping of each color component (luma or chroma component) in a specified color space. The syntax element "mapping_idc" can be used to indicate which inverse mapping method has been selected. Since different regions of an image may contain different visual content, the VDR specification may allow different partitions (e.g., different regions) to use different mapping methods. The dynamic range of each channel of an image may be divided into different segments (or pieces), and each dynamic range segment may use a different mapping method. Furthermore, each dynamic range segment in each different partition may use a different mapping method. This approach can be used for an inverse mapping in which the middle dynamic range of the media content of an image is linear and can be handled with a linear mapping, while the dark and bright ranges are non-linear and should be handled with relatively complex mapping methods. In one embodiment, the syntax element "pivot_value" is used to indicate the dynamic range segments. Additionally, optionally, or alternatively, the mapping pivot values may be differentially encoded (e.g., as indicated by the syntax element "pred_pivot_value"), or encoded directly in the coding syntax of the one or more current RPU data units signaled to the downstream VDR decoder.
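A piecewise inverse mapping driven by pivot values can be sketched as follows. The Python sketch assumes one plausible form of differential coding, in which each "pred_pivot_value" entry is a difference from the previous pivot (with the first entry taken relative to zero); the text does not specify the exact scheme. The per-segment polynomial representation and all function names are likewise hypothetical.

```python
# Sketch only: the differential pivot scheme is an assumption, not the spec.
from bisect import bisect_right


def decode_pivots(pred_pivot_value):
    """Accumulate differentially coded pivots into absolute pivot values."""
    pivots, total = [], 0
    for delta in pred_pivot_value:
        total += delta
        pivots.append(total)
    return pivots


def segment_index(pivots, value):
    """Find the dynamic range segment [pivots[i], pivots[i+1]) holding value."""
    idx = bisect_right(pivots, value) - 1
    return max(0, min(idx, len(pivots) - 2))  # clamp to a valid segment


def inverse_map(value, pivots, segment_polys):
    """Evaluate the polynomial assigned to the segment that contains value."""
    coeffs = segment_polys[segment_index(pivots, value)]
    return sum(c * value ** k for k, c in enumerate(coeffs))
```

With four pivots there are three segments, so a simple linear polynomial can serve the middle range while the dark and bright segments carry higher-order or otherwise different mappings.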

In some embodiments, at least one of a plurality of partitions of an image may use a plurality of different dynamic range segments. In some embodiments, the maximum number of different dynamic range segments over all partitions of the image is determined, or is set to be below a limit. In some embodiments, all partitions of the image keep the same number of different dynamic range segments, although the dynamic ranges in different partitions may optionally differ.

For linear mapping, the VDR encoder may use one or more syntax elements in the coding syntax to signal polynomial coefficients to the downstream VDR decoder. Alternatively, one or more syntax elements in the coding syntax may be used to signal mapped pivot values for interpolating pixels within each dynamic range segment. The presence of mapped pivot values may be signaled using a syntax element (or flag) "linear_interp_flag". In one example, some or all of the values of the data points in a 1D LUT may be signaled. In another example, at least some of the values in the 1D LUT may be created by interpolation using the mapped pivot values signaled to the downstream VDR decoder.
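Building a 1D LUT by interpolating between mapped pivot values can be sketched as below. The Python sketch assumes strictly increasing integer pivots and one mapped value per pivot; the function name `build_lut()` and its argument layout are hypothetical and are not taken from any VDR specification.

```python
# Sketch only: piecewise-linear LUT construction from (pivot, mapped value) pairs.
def build_lut(pivots, mapped, size):
    """Create `size` LUT entries by linear interpolation between pivots.

    pivots: strictly increasing input code values marking segment ends.
    mapped: output value at each pivot (same length as pivots).
    """
    if len(pivots) != len(mapped) or len(pivots) < 2:
        raise ValueError("need at least two pivots with matching mapped values")
    lut, seg = [], 0
    for v in range(size):
        # Advance to the segment containing v (the last segment is closed).
        while seg < len(pivots) - 2 and v >= pivots[seg + 1]:
            seg += 1
        x0, x1 = pivots[seg], pivots[seg + 1]
        y0, y1 = mapped[seg], mapped[seg + 1]
        t = (v - x0) / (x1 - x0)
        lut.append(y0 + t * (y1 - y0))
    return lut
```

Signaling only the pivots and their mapped values, rather than every LUT entry, trades a small amount of decoder computation for a reduction in RPU payload size.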

The coefficients for the dynamic range mapping (e.g., tone mapping) of a partition may be encoded directly or, alternatively, predicted from the mapped dynamic range segments in a neighboring partition obtained from the same RPU data unit. Additionally, optionally, or alternatively, the coefficients of a partition may be predicted from the mapped dynamic range segments in a partition of a mapped block obtained from a previous RPU data unit. For example, the coefficients of a partition may be predicted from the mapped dynamic range segments in the same partition obtained from a previous RPU data unit.

The techniques described herein support the use of a mapping color space (e.g., as indicated by the syntax element "mapping_color_space") that differs from the coding color space (which may be signaled in sequence-level information, e.g., in the RPU data header). For example, the coding color space may be YCbCr, while the mapping color space may be RGB. Other types of color spaces may be chosen as the coding space or the mapping space. The mapping color space may differ between partitions or, alternatively, may be the same for all partitions. The mapping methods and metadata may differ between channels of the mapping color space or, alternatively, may be the same for all channels of the mapping color space. In embodiments that use multiple partitions in an image, discontinuities may exist along the partition boundaries. In one embodiment, the coding syntax can be used to signal a boundary mapping method, which is performed by simply smoothing the partition boundaries with a weighted average of pixel values or color values and/or by blending the partition boundaries using linear or non-linear methods. In one embodiment, the syntax element "overlapped_prediction_method" in the coding syntax can be used, at least in part, to signal the boundary mapping method.
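One simple form of weighted-average smoothing at a partition boundary is a linear cross-fade between the predictions produced by the two neighboring partitions' mappings. The Python sketch below is only an illustration of that idea: the weighting ramp, the function name, and the assumption that both partitions supply predictions for the same overlap samples are hypothetical and do not describe what "overlapped_prediction_method" actually selects.

```python
# Sketch only: linear cross-fade across a vertical partition boundary.
def blend_across_boundary(left_pred, right_pred):
    """Blend two predictions for the same overlap samples, left to right.

    left_pred / right_pred: values predicted for the overlap region using the
    left and right partition's mapping, respectively (equal lengths).
    """
    if len(left_pred) != len(right_pred):
        raise ValueError("predictions must cover the same samples")
    overlap = len(left_pred)
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)   # weight of the right partition's mapping
        blended.append((1 - w) * left_pred[i] + w * right_pred[i])
    return blended
```

A non-linear ramp, or averaging over a two-dimensional neighborhood at partition corners, would follow the same pattern with a different weight function.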

6. RPU Data Decoding - Chroma Mapping

FIG. 7 illustrates an RPU data decoding (or parsing) operation (e.g., in the form of an rpu_data_chroma_resampling() function) that may be used, in an example embodiment, to decode (or parse) syntax elements related to chroma resampling. These syntax elements may be, but are not required to be, at the partition level. The RPU data decoding operation may be implemented as a decode/parse function that may be called, for example, by the decode/parse process of FIG. 6. In embodiments where the mapping color space is the same for the entire image, the syntax elements shown in FIG. 7 may optionally be present in the coding syntax as frame-level syntax elements.

In some embodiments, the multiple segments involved in a piecewise mapping operation may remain the same for all partitions. Some syntax elements related to interpolation may be presented in the coding syntax as frame-level syntax elements, while other syntax elements related to interpolation may be presented as partition-level syntax elements.

As shown in FIG. 7, the "rpu_data_chroma_resampling()" decoding/parsing function may decode multiple color components of a color space. Several steps may be repeated for each color component.

If the flag for a color component indicates that previous partition filter coefficients are to be used, the decoding/parsing function "rpu_data_chroma_resampling()" proceeds to obtain predictor partition information for the color component. The predictor partition information may include partition filter coefficients obtained from cached syntax elements of a previous RPU ID, or partition filter coefficients obtained from already decoded syntax elements of one or more other partitions in the one or more current RPU data units.

On the other hand, if the flag for the color component indicates that previous partition filter coefficients are not used, the decoding/parsing function "rpu_data_chroma_resampling()" proceeds to obtain the partition filter coefficients from the one or more current RPU data units. These coefficients may relate to a 2D explicit filter, a 1D vertical explicit filter, a 1D horizontal explicit filter, and so on.

As with the coefficients used in dynamic range mapping, the coefficients used in chroma resampling or chroma mapping for a partition may alternatively be predicted from similar coefficients in adjacent partitions obtained from the same one or more current RPU data units. Additionally, optionally, or alternatively, the coefficients for a partition may be predicted from similar coefficients in a partition of a mapped block obtained from one or more previously transmitted RPU data units. For example, the coefficients for a partition may be predicted from similar coefficients in the same partition obtained from one or more previously transmitted RPU data units.
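
The per-component branching described for "rpu_data_chroma_resampling()" could be sketched roughly as below. This is not the syntax of FIG. 7: the input is modeled as an iterator of already entropy-decoded integers, and the flag layout, the fixed tap count, and the shape of the cache are assumptions made purely for illustration.

```python
from typing import Dict, Iterator, List, Optional


def parse_chroma_resampling(
    symbols: Iterator[int],
    num_components: int,
    num_taps: int,
    cached: Optional[Dict[int, List[int]]] = None,
) -> Dict[int, List[int]]:
    """Return filter coefficients per color component.

    For each component one flag is read first (hypothetical layout):
      flag != 0 -> reuse coefficients from `cached`, e.g. from a previous
                   RPU data unit or an already decoded partition;
      flag == 0 -> read `num_taps` explicit coefficients from the stream.
    """
    filters: Dict[int, List[int]] = {}
    for comp in range(num_components):
        use_previous = next(symbols)
        if use_previous:
            if cached is None or comp not in cached:
                raise ValueError(f"no cached coefficients for component {comp}")
            filters[comp] = list(cached[comp])
        else:
            filters[comp] = [next(symbols) for _ in range(num_taps)]
    return filters
```

A 2D explicit filter could be handled the same way by reading rows of taps instead of a single list.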

7. Additional Examples of RPU Data Decoding

FIG. 9 illustrates an RPU data decoding (or parsing) operation (e.g., in the form of an rpu_data_nlq() function) that, in an example embodiment, decodes (or parses) syntax elements related to nonlinear quantization/dequantization at the partition level. A specific VDR specification may support nonlinear quantization/dequantization. Examples of nonlinear quantization/dequantization include, but are not limited to, those based on a linear dead zone, a μ-law curve, a Laplacian curve, a sigmoid curve, and the like. The specific method of nonlinear quantization/dequantization may be signaled to a downstream VDR decoder in the syntax element "nlq_method_idc". In one embodiment, the same method may be used for all partitions of an image (e.g., the syntax element "nlq_method_idc" may be signaled as part of the frame-level information in the RPU data header); the coefficients of the method, however, may be the same or different for different partitions. The data range involved in nonlinear quantization/dequantization may be divided into multiple segments, and different segments may have different coefficients for the same method.
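
To make one of the listed methods concrete, the sketch below applies the standard μ-law companding curve to a residual value normalized to [-1, 1] and quantizes the companded value uniformly. The choice of μ = 255, the 255-level code range, and the function names are assumptions; how "nlq_method_idc" values map to methods, and how per-segment coefficients are carried, is not specified here.

```python
import math


def mu_law_compress(x: float, mu: float = 255.0) -> float:
    """F(x) = sign(x) * ln(1 + mu*|x|) / ln(1 + mu), for x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: float = 255.0) -> float:
    """Inverse curve: sign(y) * ((1 + mu)**|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(x: float, levels: int = 255, mu: float = 255.0) -> int:
    """Encoder side: compand, then quantize uniformly to a signed code."""
    half = levels // 2
    x = max(-1.0, min(1.0, x))  # clamp to the supported data range
    return round(mu_law_compress(x, mu) * half)


def dequantize(code: int, levels: int = 255, mu: float = 255.0) -> float:
    """Decoder side: undo the uniform step, then expand."""
    half = levels // 2
    return mu_law_expand(code / half, mu)
```

Because the curve is steep near zero, small residuals are reconstructed with a much finer step than large ones, which is the usual motivation for nonlinear quantization of residual data.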

As with other coefficients used in other operations, the coefficients used in the nonlinear quantization/dequantization of a partition may be encoded directly or predicted from the same coefficients in adjacent partitions obtained from the same RPU data unit. Additionally, optionally, or alternatively, the coefficients of a partition may be predicted from similar coefficients in a partition of a mapped block obtained from a previous RPU data unit. For example, the coefficients of a partition may be predicted from similar coefficients in the same partition obtained from a previous RPU data unit.

In some embodiments, a coding syntax conforming to a specific VDR specification may specify chroma resampling and/or spatial upsampling (e.g., 1:2) to be performed on residual data. In some embodiments, operations performed on residual data are handled in the coding syntax in a manner similar to the operations related to the chroma resampling filters discussed above.

In some embodiments, different chroma formats are used for the image data encoded in the BL signal and in the EL signal. For example, the BL signal may use a chroma format, chroma sampling, and bit depth that differ from those used by the EL signal. Additionally, optionally, or alternatively, the BL signal and the EL signal may use different color spaces.

The techniques described herein support different processing orders among chroma resampling, color space conversion, and inverse mapping. In some embodiments, a VDR coding system may support one, two, or more of multiple possible processing orders. One or more of the processing orders supported by the VDR coding system may be regarded as optimal. For example, suppose that the coding color space of both the BL data and the EL data in the output bitstreams (e.g., BL bitstream 228 and EL bitstream 224 of FIG. 2) of a VDR encoder (e.g., 202 of FIG. 2) is YCbCr as specified by the VDR specification, that the mapping color space is RGB, that the input SDR signal (e.g., 208 of FIG. 2) is YCbCr 4:2:0, and that the input VDR signal (e.g., 204 of FIG. 2) is 12-bit RGB 4:4:4. In this example, inter-layer reference data may be generated as follows. First, the VDR encoder performs chroma resampling from 4:2:0 to 4:4:4 on the BL data obtained from the input SDR signal. Next, a color transform from YCbCr to RGB may be performed on the BL data (now in the 4:4:4 chroma format). Inverse mapping may then be performed in the mapping color space on the BL data (now in the 4:4:4 chroma format and in the mapping color space) to generate inter-layer prediction values in the mapping color space. Finally, for the purpose of generating EL data, a color transform from RGB to YCbCr may be performed on the inter-layer prediction values.
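
The four-step order in this example (chroma resampling, YCbCr to RGB, inverse mapping, RGB to YCbCr) can be sketched as follows. Several stand-ins are assumed for brevity: sample replication takes the place of a signaled resampling filter, the BT.709 luma weights serve as an example color matrix, samples are normalized (Y in [0, 1], Cb and Cr in [-0.5, 0.5]), and the inverse mapping is an arbitrary per-channel function supplied by the caller.

```python
from typing import Callable, List, Tuple

Plane = List[List[float]]
Pixel = Tuple[float, float, float]

# BT.709 luma weights, used here only as an assumed example matrix.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def upsample_420_to_444(chroma: Plane) -> Plane:
    """Double a chroma plane in both directions by sample replication."""
    out: Plane = []
    for row in chroma:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def ycbcr_to_rgb(y: float, cb: float, cr: float) -> Pixel:
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def rgb_to_ycbcr(r: float, g: float, b: float) -> Pixel:
    y = KR * r + KG * g + KB * b
    return y, (b - y) / (2.0 * (1.0 - KB)), (r - y) / (2.0 * (1.0 - KR))


def inter_layer_prediction(
    y_plane: Plane,
    cb_plane: Plane,
    cr_plane: Plane,
    inverse_map: Callable[[float], float],
) -> List[List[Pixel]]:
    """Return per-pixel (Y, Cb, Cr) prediction values at 4:4:4 resolution."""
    cb444 = upsample_420_to_444(cb_plane)  # step 1: 4:2:0 -> 4:4:4
    cr444 = upsample_420_to_444(cr_plane)
    predicted = []
    for y_row, cb_row, cr_row in zip(y_plane, cb444, cr444):
        out_row = []
        for y, cb, cr in zip(y_row, cb_row, cr_row):
            rgb = ycbcr_to_rgb(y, cb, cr)                # step 2: to mapping space
            mapped = tuple(inverse_map(c) for c in rgb)  # step 3: inverse mapping
            out_row.append(rgb_to_ycbcr(*mapped))        # step 4: back to coding space
        predicted.append(out_row)
    return predicted
```

Reordering the steps (for instance, inverse mapping before the color transform) would only require moving the `inverse_map` call, which is one way to picture the "different processing orders" mentioned above.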

8. Example Process Flows

FIG. 10 illustrates a VDR decoder that, in an example embodiment, decodes a coding syntax from RPU data. The coding syntax may conform to a specific VDR specification, which may be, for example, a first version ("1.0") or a second version ("1.x") supported by the VDR encoder 102 of FIG. 1. The VDR decoder may be configured to perform decoding operations on BL data, EL data, RPU data, inter-layer prediction data, and intermediate media data in accordance with the coding syntax. The VDR decoder of FIG. 10 may be implemented using one or more computing devices, custom and/or off-the-shelf hardware devices, programmable devices, any combination of the foregoing, and the like.

In some embodiments, the VDR decoder of FIG. 10 may implement one or more of the decoding/parsing processes shown in FIG. 5 through FIG. 9 to obtain the coding syntax and the syntax elements therein. The VDR decoder of FIG. 10 may apply decoding operations to the BL data, EL data, and RPU data to construct an output VDR image corresponding to an input VDR image encoded, for example, by a VDR encoder (e.g., 102 of FIG. 1).

FIG. 11A illustrates an example process flow according to an example embodiment of the present invention. In some example embodiments, one or more computing devices or components may perform this process flow. In block 1102, a multi-layer VDR video encoder (e.g., 102 of FIG. 1 or 202 of FIG. 2) receives an input visual dynamic range (VDR) image and an input base layer (BL) image associated with the input VDR image.

In block 1104, the multi-layer VDR video encoder generates a coding syntax comprising a plurality of syntax elements at a sequence level, a frame level, or a partition level.

In block 1106, the multi-layer VDR video encoder converts the input BL image and the input VDR image into BL data and enhancement layer (EL) data in accordance with the coding syntax.

In block 1108, the multi-layer VDR video encoder converts the coding syntax into reference processing unit (RPU) data.

In block 1110, the multi-layer VDR video encoder outputs the BL data, the EL data, and the RPU data in a BL signal, an EL signal, and an RPU signal.
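
Blocks 1102 through 1110 can be pictured with the toy encoder below. It is a sketch under heavy simplifications: images are flat lists of samples, the inter-layer prediction is a single gain, the EL data are plain residuals, and the RPU data are a JSON serialization of the coding syntax. None of these choices, nor the dictionary keys, come from the description; an actual encoder would compress the BL and EL data with video codecs and write the RPU data in a binary syntax.

```python
import json
from typing import Dict, List, Tuple


def encode_frame(
    vdr_image: List[float], bl_image: List[float], gain: float
) -> Tuple[List[float], List[float], bytes]:
    """Toy counterpart of blocks 1102-1110 for one frame."""
    if len(vdr_image) != len(bl_image):  # block 1102: paired inputs
        raise ValueError("VDR and BL images must have the same size")

    # Block 1104: build a coding syntax with sequence- and frame-level elements.
    syntax: Dict[str, Dict[str, object]] = {
        "sequence": {"vdr_spec_version": "1.0"},     # hypothetical element
        "frame": {"mapping": "gain", "gain": gain},  # hypothetical elements
    }

    # Block 1106: BL data pass through; EL data are prediction residuals.
    predicted = [gain * sample for sample in bl_image]
    el_data = [v - p for v, p in zip(vdr_image, predicted)]

    # Block 1108: serialize the coding syntax as RPU data.
    rpu_data = json.dumps(syntax, sort_keys=True).encode("utf-8")

    # Block 1110: output the three layers.
    return list(bl_image), el_data, rpu_data
```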

In one embodiment, the multi-layer VDR video encoder is further configured to perform: generating one or more current RPU data units based at least in part on the coding syntax; and determining, in the one or more current RPU data units, a specific VDR specification with which the coding syntax complies.

In one embodiment, at least one of the one or more current RPU data units comprises a data structure capable of supporting any one of a plurality of different VDR specifications.

In one embodiment, the multi-layer VDR video encoder is further configured to perform: indicating, in the one or more current RPU data units, at least one syntax element, among the plurality of syntax elements in the coding syntax, that is predictable from one or more other partitions in the one or more current RPU data units.

In one embodiment, the multi-layer VDR video encoder is further configured to perform: indicating, in the one or more current RPU data units, at least one syntax element, among the plurality of syntax elements in the coding syntax, that is predictable from one or more previous RPU data units of a previous input VDR image and a previous input BL image associated with the previous input VDR image.

In one embodiment, the input VDR image and the previous input VDR image belong to a sequence of input VDR images; the sequence of input VDR images shares a common set of sequence-level syntax elements.

In one embodiment, the input VDR image and the previous input VDR image belong to two different sequences of input VDR images; a first of the two sequences shares a first common set of sequence-level syntax elements; and a second of the two sequences shares a different, second common set of sequence-level syntax elements.

In one embodiment, at least one of the plurality of syntax elements may be used as a syntax element at two or more of the sequence level, the frame level, or the partition level.

In one embodiment, the BL data represent a standard dynamic range (SDR) image optimized for viewing on an SDR display. In one embodiment, the BL data do not represent an SDR image optimized for viewing on an SDR display.

In one embodiment, the EL data comprise residual values between the input VDR image and a predicted VDR image generated based on the BL data. In one embodiment, the EL data comprise inter-layer reference pictures for two or more input VDR images in a sequence of input VDR images; the two or more input VDR images include the input VDR image.

In one embodiment, the plurality of syntax elements comprise one or more parameters, coefficients, pivot values, flags each indicating the presence or absence of an operation corresponding to the flag, or one or more types of metadata including display management metadata.

In one embodiment, the input VDR image comprises image data encoded in an input color space; the EL data comprise image data encoded in an output color space; the EL data are generated based at least in part on mapping data; the mapping data are generated based at least in part on the BL data; and the mapping data comprise mapped image data encoded in a mapping color space.

In one embodiment, at least two of the input color space, the output color space, and the mapping color space are different. In one embodiment, at least two of the input color space, the output color space, and the mapping color space are the same.

In one embodiment, the EL data comprise image data encoded in a first chroma format, and the BL data comprise image data encoded in a different, second chroma format. In one embodiment, the EL data comprise image data encoded in a chroma format, and the BL data comprise image data encoded in the same chroma format.

In one embodiment, the plurality of syntax elements signal one or more of the following operations: a chroma resampling operation, an inverse mapping operation, a prediction operation based on non-overlapping regions, a prediction operation based on overlapping regions, residual nonlinear quantization and dequantization operations, a residual chroma resampling operation, a spatial scaling operation, a data processing operation including interpolation, or a display management operation.

In one embodiment, the multi-layer VDR video encoder is further configured to perform: converting one or more input VDR images represented, received, transmitted, or stored with one or more input video signals into one or more output VDR images represented, received, transmitted, or stored with one or more output video signals.

In one embodiment, the input VDR image comprises image data encoded in one of: a high dynamic range (HDR) image format, an RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), the P3 color space standard of Digital Cinema Initiatives (DCI), the Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, an RGB color space, or a YCbCr color space.

FIG. 11B illustrates an example process flow according to an example embodiment of the present invention. In some example embodiments, one or more computing devices or hardware components may perform this process flow. In block 1152, a multi-layer video decoder (e.g., as shown in FIG. 10) receives BL data, EL data, and RPU data in a base layer (BL) signal, an enhancement layer (EL) signal, and a reference processing unit (RPU) signal; the BL data, the EL data, and the RPU data are associated with a common visual dynamic range (VDR) source image.

In block 1154, the multi-layer video decoder decodes the RPU data into a coding syntax comprising a plurality of syntax elements at a sequence level, a frame level, or a partition level.

In block 1156, the multi-layer video decoder converts the BL data and the EL data into a reconstructed VDR image in accordance with the coding syntax.

In block 1158, the multi-layer video decoder outputs the reconstructed VDR image.
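
Blocks 1152 through 1158 might be pictured with the toy decoder below. As a sketch only, it assumes the RPU data are a UTF-8 JSON document carrying a hypothetical frame-level "gain" element, that the inter-layer prediction is that gain applied to each BL sample, and that the EL data are additive residuals; the description prescribes none of these details.

```python
import json
from typing import List


def decode_frame(bl_data: List[float], el_data: List[float], rpu_data: bytes) -> List[float]:
    """Toy counterpart of blocks 1152-1158 for one frame."""
    if len(bl_data) != len(el_data):  # block 1152: layers of one source image
        raise ValueError("BL and EL data must have the same size")

    # Block 1154: decode the RPU data into the coding syntax.
    syntax = json.loads(rpu_data.decode("utf-8"))
    gain = float(syntax["frame"]["gain"])  # hypothetical frame-level element

    # Block 1156: predict from the BL data and add the EL residuals.
    reconstructed = [gain * b + e for b, e in zip(bl_data, el_data)]

    # Block 1158: output the reconstructed VDR image.
    return reconstructed
```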

In one embodiment, the multi-layer video decoder is further configured to perform: determining, from one or more current RPU data units, a specific VDR specification with which the coding syntax complies; and obtaining at least a portion of the coding syntax from the one or more current RPU data units.

In one embodiment, the multi-layer video decoder is further configured to perform: determining, from the one or more current RPU data units, at least one syntax element, among the plurality of syntax elements in the coding syntax, that can be predicted from one or more other partitions in the one or more current RPU data units.

In one embodiment, the multi-layer video decoder is further configured to perform: determining, from the one or more current RPU data units, at least one syntax element, among the plurality of syntax elements in the coding syntax, that can be predicted from one or more previous RPU data units related to a previously reconstructed VDR image.

In one embodiment, the reconstructed VDR image and the previously reconstructed VDR image belong to a sequence of reconstructed VDR images; the sequence of reconstructed VDR images shares a common set of sequence-level syntax elements.

In one embodiment, the reconstructed VDR image and the previously reconstructed VDR image belong to two different sequences of reconstructed VDR images; a first of the two sequences shares a first common set of sequence-level syntax elements; and a second of the two sequences shares a different, second common set of sequence-level syntax elements.

In one embodiment, the EL data comprise inter-layer reference pictures for two or more reconstructed VDR images in a sequence of reconstructed VDR images, and the two or more reconstructed VDR images include the reconstructed VDR image.

In one embodiment, the reconstructed VDR image comprises image data encoded in a first color space; the EL data comprise image data encoded in a second color space; the reconstructed VDR image is generated based at least in part on mapping data obtained from the BL data; and the mapping data comprise mapped image data encoded in a third color space.

In one embodiment, at least two of the first color space, the second color space, and the third color space are different. In one embodiment, at least two of the first color space, the second color space, and the third color space are the same.

In one embodiment, the multi-layer video decoder is further configured to perform: converting image data represented, received, transmitted, or stored with one or more input video signals into one or more output VDR images represented, received, transmitted, or stored with one or more output video signals.

In one embodiment, the reconstructed VDR image comprises image data encoded in one of: a high dynamic range (HDR) image format, an RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), the P3 color space standard of Digital Cinema Initiatives (DCI), the Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, an RGB color space, or a YCbCr color space.

In various example embodiments, an encoder, a decoder, a system, an apparatus, or one or more other computing devices performs any of the foregoing methods, or a part thereof.

9. Implementation Mechanisms - Hardware Overview

According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general-purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination thereof. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices, or any other device that incorporates hard-wired and/or program logic to implement the techniques.

For example, FIG. 12 is a block diagram that illustrates a computer system 1200 upon which an example embodiment of the invention may be implemented. Computer system 1200 includes a bus 1202 or other communication mechanism for communicating information, and a hardware processor 1204 coupled with bus 1202 for processing information. Hardware processor 1204 may be, for example, a general-purpose microprocessor.

Computer system 1200 also includes a main memory 1206, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 1202 for storing information and instructions to be executed by processor 1204. Main memory 1206 may also be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1204. Such instructions, when stored in non-transitory storage media accessible to processor 1204, render computer system 1200 into a special-purpose machine that is customized to perform the operations specified in the instructions.

Computer system 1200 further includes a read only memory (ROM) 1208 or other static storage device coupled to bus 1202 for storing static information and instructions for processor 1204. A storage device 1210, such as a magnetic disk or optical disk, is provided and coupled to bus 1202 for storing information and instructions.

Computer system 1200 may be coupled via bus 1202 to a display 1212, such as a liquid crystal display, for displaying information to a computer user. An input device 1214, including alphanumeric and other keys, is coupled to bus 1202 for communicating information and command selections to processor 1204. Another type of user input device is cursor control 1216, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 1204 and for controlling cursor movement on display 1212. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allow the device to specify positions in a plane.

Computer system 1200 may implement the techniques described herein using custom hard-wired logic, one or more ASICs or FPGAs, and firmware and/or program logic which, in combination with the computer system, causes or programs computer system 1200 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 1200 in response to processor 1204 executing one or more sequences of one or more instructions contained in main memory 1206. Such instructions may be read into main memory 1206 from another storage medium, such as storage device 1210. Execution of the sequences of instructions contained in main memory 1206 causes processor 1204 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.

As used herein, the term "storage media" refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media include, for example, optical or magnetic disks, such as storage device 1210. Volatile media include dynamic memory, such as main memory 1206. Common forms of storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid-state drive, magnetic tape or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, and any other memory chip or cartridge.

Storage media are distinct from, but may be used in conjunction with, transmission media. Transmission media participate in transferring information between storage media. For example, transmission media include coaxial cables, copper wire, and fiber optics, including the wires that comprise bus 1202. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infrared data communications.

Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 1204 for execution. For example, the instructions may initially be carried on a magnetic disk or solid-state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 1200 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector can receive the data carried in the infrared signal, and appropriate circuitry can place the data on bus 1202. Bus 1202 carries the data to main memory 1206, from which processor 1204 retrieves and executes the instructions. The instructions received by main memory 1206 may optionally be stored on storage device 1210 either before or after execution by processor 1204.

Computer system 1200 also includes a communication interface 1218 coupled to bus 1202. Communication interface 1218 provides a two-way data communication coupling to a network link 1220 that is connected to a local network 1222. For example, communication interface 1218 may be an integrated services digital network (ISDN) card, a cable modem, a satellite modem, or a modem that provides a data communication connection to a corresponding type of telephone line. As another example, communication interface 1218 may be a local area network (LAN) card that provides a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 1218 sends and receives electrical, electromagnetic, or optical signals that carry digital data streams representing various types of information.

Network link 1220 typically provides data communication through one or more networks to other data devices. For example, network link 1220 may provide a connection through local network 1222 to a host computer 1224 or to data equipment operated by an Internet Service Provider (ISP) 1226. ISP 1226 in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the "Internet" 1228. Local network 1222 and Internet 1228 both use electrical, electromagnetic, or optical signals that carry digital data streams. The signals through the various networks, and the signals on network link 1220 and through communication interface 1218, which carry the digital data to and from computer system 1200, are example forms of transmission media.

计算机系统1200可以通过网络、网络链接1220和通信接口1218发送消息和接收包括程序代码的数据。在因特网示例中，服务器1230可以通过因特网1228、ISP 1226、本地网络1222和通信接口1218为应用程序发送所请求的代码。Computer system 1200 can send messages and receive data, including program code, through the network, network link 1220, and communication interface 1218. In the Internet example, server 1230 can send the requested code for an application program through Internet 1228, ISP 1226, local network 1222, and communication interface 1218.

当被接收时，所接收的代码可以由处理器1204执行，和/或存储在存储装置1210或其他非易失性存储器中用于后续执行。When received, the received code may be executed by processor 1204 and/or stored in storage device 1210 or other non-volatile storage for later execution.

10. Equivalents, Extensions, Alternatives, and Miscellaneous

In the foregoing specification, example embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what the invention is, and of what the applicants intend the invention to be, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims may govern the meaning of those terms as used in the claims. Hence, no limitation, element, property, feature, advantage, or attribute that is not expressly recited in a claim should limit the scope of that claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

In addition, embodiments of the present disclosure also include:

(1) A method comprising:

receiving an input visual dynamic range (VDR) image and an input base layer (BL) image associated with the input VDR image;

generating a coding syntax comprising a plurality of syntax elements at at least one of a sequence level, a frame level, or a partition level;

converting the input BL image and the input VDR image into BL data and enhancement layer (EL) data according to the coding syntax;

converting the syntax elements of the coding syntax into reference processing unit (RPU) data; and

outputting the BL data, the EL data, and the RPU data in a BL signal, an EL signal, and an RPU signal.

(2) The method according to (1), further comprising:

generating one or more current RPU data units based at least in part on the coding syntax; and

determining, in the one or more current RPU data units, a particular VDR specification with which the coding syntax complies.

(3) The method according to (2), wherein at least one of the one or more current RPU data units includes a data structure capable of supporting any one of a plurality of different VDR specifications.

(4) The method according to (2), further comprising: indicating, in the one or more current RPU data units, that at least one syntax element of the plurality of syntax elements of the coding syntax is predictable from one or more other partitions of the one or more current RPU data units.

(5) The method according to (2), further comprising: indicating, in the one or more current RPU data units, that at least one syntax element of the plurality of syntax elements of the coding syntax is predictable from one or more previous RPU data units of a previous input VDR image and a previous input BL image associated with the previous input VDR image.

(6) The method according to (5), wherein the input VDR image and the previous input VDR image belong to a sequence of input VDR images, and wherein the sequence of input VDR images shares a common set of sequence-level syntax elements.

(7) The method according to (5), wherein the input VDR image and the previous input VDR image belong to two different sequences of input VDR images; wherein a first sequence of the two different sequences shares a first common set of sequence-level syntax elements; and wherein a second sequence of the two different sequences shares a different, second common set of sequence-level syntax elements.

(8) The method according to (1), wherein at least one syntax element of the plurality of syntax elements can be used as a syntax element at two or more of the sequence level, the frame level, or the partition level.

(9) The method according to (1), wherein the BL data represents a standard dynamic range (SDR) image optimized for viewing on an SDR display.

(10) The method according to (1), wherein the BL data does not represent a standard dynamic range (SDR) image optimized for viewing on an SDR display.

(11) The method according to (1), wherein the EL data includes residual values between the input VDR image and a predicted VDR image generated based on the BL data.

(12) The method according to (1), wherein the EL data includes inter-layer reference pictures for two or more input VDR images in a sequence of input VDR images, and wherein the two or more input VDR images include the input VDR image.

(13) The method according to (1), wherein the plurality of syntax elements includes one or more of: a parameter; a coefficient; a pivot value; a flag indicating the presence or absence of an operation corresponding to the flag; or one or more types of metadata, including display management metadata.

(14) The method according to (1), wherein the input VDR image includes image data encoded in an input color space, wherein the EL data includes image data encoded in an output color space, wherein the EL data is generated based at least in part on mapping data, wherein the mapping data is generated based at least in part on the BL data, and wherein the mapping data includes mapped image data encoded in a mapped color space.

(15) The method according to (14), wherein at least two of the input color space, the output color space, the mapped color space, or a coding color space are different.

(16) The method according to (14), wherein at least two of the input color space, the output color space, the mapped color space, or a coding color space are the same.

(17) The method according to (1), wherein the EL data includes image data encoded in a first chroma format, and wherein the BL data includes image data encoded in a second, different chroma format.

(18) The method according to (1), wherein the EL data includes image data encoded in a chroma format, and wherein the BL data includes image data encoded in the same chroma format.

(19) The method according to (1), wherein the plurality of syntax elements specifies one or more of the following operations: a chroma resampling operation, an inverse mapping operation, a prediction operation based on non-overlapping regions, a prediction operation based on overlapping regions, residual nonlinear quantization and dequantization operations, a residual chroma resampling operation, a residual spatial upsampling operation, a data processing operation including interpolation, or a display management operation.

(20) The method according to (1), further comprising: converting one or more input VDR images represented, received, transmitted, or stored with one or more input video signals into one or more output VDR images represented, received, transmitted, or stored with one or more output video signals.

(21) The method according to (1), wherein the input VDR image includes image data encoded in one of the following: a high dynamic range (HDR) image format, an RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), the P3 color space standard of the Digital Cinema Initiatives (DCI), the Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, an RGB color space, or a YCbCr color space.

(22) A method comprising:

receiving BL data, EL data, and RPU data in a base layer (BL) signal, an enhancement layer (EL) signal, and a reference processing unit (RPU) signal, the BL data, the EL data, and the RPU data being associated with a common visual dynamic range (VDR) source image;

decoding the RPU data into a coding syntax comprising a plurality of syntax elements at at least one of a sequence level, a frame level, or a partition level;

converting the BL data and the EL data into a reconstructed VDR image according to the coding syntax; and

outputting the reconstructed VDR image.

(23) The method according to (22), further comprising:

determining, from one or more current RPU data units, a particular VDR specification with which the coding syntax complies; and

deriving at least a portion of the syntax elements of the coding syntax from the one or more current RPU data units.

(24) The method according to (23), wherein at least one of the one or more current RPU data units includes a data structure capable of supporting any one of a plurality of different VDR specifications.

(25) The method according to (23), further comprising: determining, from the one or more current RPU data units, that at least one syntax element of the plurality of syntax elements of the coding syntax is predictable from one or more other partitions of the one or more current RPU data units.

(26) The method according to (23), further comprising: determining, from the one or more current RPU data units, that at least one syntax element of the plurality of syntax elements of the coding syntax is predictable from one or more previous RPU data units related to a previously reconstructed VDR image.

(27) The method according to (26), wherein the reconstructed VDR image and the previously reconstructed VDR image belong to a sequence of reconstructed VDR images, and wherein the sequence of reconstructed VDR images shares a common set of sequence-level syntax elements.

(28) The method according to (26), wherein the reconstructed VDR image and the previously reconstructed VDR image belong to two different sequences of reconstructed VDR images; wherein a first sequence of the two different sequences shares a first common set of sequence-level syntax elements; and wherein a second sequence of the two different sequences shares a different, second common set of sequence-level syntax elements.

(29) The method according to (22), wherein at least one syntax element of the plurality of syntax elements can be used as a syntax element at two or more of the sequence level, the frame level, or the partition level.

(30) The method according to (22), wherein the BL data represents a standard dynamic range (SDR) image optimized for viewing on an SDR display.

(31) The method according to (22), wherein the BL data does not represent a standard dynamic range (SDR) image optimized for viewing on an SDR display.

(32) The method according to (22), wherein the EL data includes residual values between a VDR image and a predicted VDR image generated based on the BL data.

(33) The method according to (22), wherein the EL data includes inter-layer reference pictures for two or more reconstructed VDR images in a sequence of reconstructed VDR images, and wherein the two or more reconstructed VDR images include the reconstructed VDR image.

(34) The method according to (22), wherein the plurality of syntax elements includes one or more of: a parameter; a coefficient; a pivot value; a flag indicating the presence or absence of an operation corresponding to the flag; or one or more types of metadata, including display management metadata.

(35) The method according to (22), wherein the reconstructed VDR image includes image data encoded in a first color space, wherein the EL data includes image data encoded in a second color space, wherein the reconstructed VDR image is generated based at least in part on mapping data derived from the BL data, and wherein the mapping data includes mapped image data encoded in a third color space.

(36) The method according to (35), wherein at least two of the first color space, the second color space, and the third color space are different.

(37) The method according to (35), wherein at least two of the first color space, the second color space, and the third color space are the same.

(38) The method according to (22), wherein the EL data includes image data encoded in a first chroma format, and wherein the BL data includes image data encoded in a second, different chroma format.

(39) The method according to (22), wherein the EL data includes image data encoded in a chroma format, and wherein the BL data includes image data encoded in the same chroma format.

(40) The method according to (22), wherein the plurality of syntax elements signals one or more of the following operations: a chroma resampling operation, an inverse mapping operation, a prediction operation based on non-overlapping regions, a prediction operation based on overlapping regions, residual nonlinear quantization and dequantization operations, a residual chroma resampling operation, a spatial scaling operation, a data processing operation including interpolation, or a display management operation.

(41) The method according to (22), further comprising: converting image data represented, received, transmitted, or stored with one or more input video signals into one or more output VDR images represented, received, transmitted, or stored with one or more output video signals.

(42) The method according to (22), wherein the reconstructed VDR image includes image data encoded in one of the following: a high dynamic range (HDR) image format, an RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), the P3 color space standard of the Digital Cinema Initiatives (DCI), the Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, an RGB color space, or a YCbCr color space.

(43) An encoder that performs any of the methods according to (1) to (21).

(44) A decoder that performs any of the methods according to (22) to (42).

(45) A system that performs any of the methods according to (1) to (42).

(46) A system comprising:

an encoder configured to perform:

generating a coding syntax comprising a plurality of syntax elements at a sequence level, a frame level, or a partition level;

converting the input VDR image and the input BL image into BL data and enhancement layer (EL) data according to the coding syntax;

converting the coding syntax into reference processing unit (RPU) data; and

outputting the BL data, the EL data, and the RPU data in a BL signal, an EL signal, and an RPU signal; and

a decoder configured to perform:

receiving the BL data, the EL data, and the RPU data in the BL signal, the EL signal, and the RPU signal;

decoding the RPU data into the coding syntax comprising the plurality of syntax elements at the sequence level, the frame level, or the partition level;

converting the BL data and the EL data into a reconstructed VDR image according to the coding syntax; and

outputting the reconstructed VDR image.

(47) A non-transitory computer-readable medium storing software instructions that, when executed by one or more processors, cause performance of the steps of any of the methods according to (1) to (42).
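
By way of illustration only, and without limiting any embodiment, the decoder-side conversion of (22) together with the residual-based EL data of (32) can be sketched in Python as follows. The names CodingSyntax, predict_vdr, and reconstruct_vdr are hypothetical and do not appear in this disclosure. The polynomial predictor and the linear residual scaling are assumptions that stand in for whichever inverse mapping and dequantization operations the coding syntax actually signals.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CodingSyntax:
    """Hypothetical container for syntax elements decoded from RPU data."""
    poly_coeffs: Sequence[float]  # assumed inverse-mapping polynomial, lowest order first
    residual_scale: float         # assumed linear dequantization step for EL samples
    residual_offset: int          # assumed EL code value that represents a zero residual


def predict_vdr(bl_sample: float, coeffs: Sequence[float]) -> float:
    """Inverse-map one BL sample to a predicted VDR sample (Horner evaluation)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * bl_sample + c
    return result


def reconstruct_vdr(bl: Sequence[float], el: Sequence[int],
                    syntax: CodingSyntax) -> List[float]:
    """Combine BL and EL samples into reconstructed VDR samples.

    Each output sample is the prediction from the BL sample plus the
    dequantized residual carried by the co-located EL sample.
    """
    out = []
    for b, e in zip(bl, el):
        predicted = predict_vdr(b, syntax.poly_coeffs)
        residual = (e - syntax.residual_offset) * syntax.residual_scale
        out.append(predicted + residual)
    return out
```

In this sketch a one-dimensional sequence of samples stands in for an image; a practical decoder would apply the same per-sample combination to each color channel and partition, using the syntax elements that apply at that level.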

Claims

1. A decoder for generating high dynamic range images, the decoder comprising a non-transitory memory and one or more processors, wherein generating an output image using the decoder comprises:

receiving reference processing unit (RPU) data and storing at least a portion of the RPU data in the non-transitory memory;

extracting at least an RPU data header and an RPU data payload from the RPU data;

receiving a base layer image;

receiving an enhancement layer image;

determining whether the decoder can use syntax elements from a previous RPU data payload;

upon determining that the decoder cannot use syntax elements from the previous RPU data payload, parsing the RPU data payload to extract syntax elements, and upon determining that the decoder can use syntax elements from the previous RPU data payload, decoding a previous RPU ID value to predict the syntax elements, wherein the syntax elements include a current RPU ID value, a mapped color space indicator, a chroma resampling filter indicator, pivot mapping parameters, and partition parameters; and

combining the base layer image and the enhancement layer image based on the syntax elements to generate the output image,

wherein parsing the RPU data payload comprises:

decoding the current RPU ID value;

decoding the mapped color space indicator;

decoding the chroma resampling filter indicator;

decoding the pivot mapping parameters; and

decoding the partition parameters.

2. The decoder according to claim 1, wherein the syntax elements include inter-layer prediction data and nonlinear quantizer (NLQ) data.

3. The decoder according to claim 1, wherein determining whether the decoder can use syntax elements from a previous RPU data payload comprises parsing a use_prev_vdr_rpu_flag flag which, when set to 1, indicates that the syntax elements are derived from a previously received RPU data payload and, when set to 0, indicates that the syntax elements are explicitly derived.

4. The decoder according to claim 2, wherein the inter-layer prediction data is extracted according to the rpu_data_mapping() syntax structure, and the nonlinear quantizer NLQ data is extracted according to the rpu_data_nlq() syntax structure.

5. The decoder according to claim 4, wherein the rpu_data_mapping() syntax structure further includes:

a plurality of pivot indices that divide the dynamic range of the output image into segments, wherein,

for one or more of the pivot indices, the rpu_data_mapping() syntax structure includes a corresponding mapping index indicating an inverse mapping method; and wherein,

for one or more mapping methods, the rpu_data_mapping() syntax structure includes corresponding mapping parameters.

6. The decoder according to claim 5, wherein the inverse mapping method includes one of the following: a polynomial prediction method, an MMR prediction method, or a lookup table prediction method.

7. The decoder according to claim 4, wherein the rpu_data_nlq() syntax structure further includes:

for one or more pivot indices, a corresponding nonlinear quantization method; and

for one or more nonlinear quantization methods, corresponding nonlinear quantization parameters.

8. The decoder according to claim 7, wherein the nonlinear quantization method includes one of the following: a linear dead zone method, a μ-law method, a Laplacian method, or a sigmoid method.

9. The decoder according to claim 8, wherein, for the linear dead zone method, the rpu_data_nlq() syntax structure further includes a decoder output level value and a decoder residual input maximum value.

10. The decoder according to claim 1, wherein the RPU data header comprises:

an RPU type flag indicating an encoding type, wherein the encoding type includes either 3D image encoding or high dynamic range image encoding;

an RPU format field indicating a version format of the RPU data payload;

RPU attribute fields; and

an RPU level field.

11. The decoder of claim 10, wherein the RPU type flag includes an rpu_type flag, and when rpu_type = 2, the encoding type includes high dynamic range image encoding.

12. A method for generating a high dynamic range image using a decoder, wherein generating an output image using the decoder comprises:

receiving reference processing unit (RPU) data and storing at least a portion of the RPU data in a non-transitory memory;

receiving a base layer image;

receiving an enhancement layer image;

wherein parsing of the RPU data payload comprises:

decoding the current RPU ID value;

decoding the mapped color space indicator;

decoding the chroma resampling filter indicator;

decoding the pivot mapping parameters; and

decoding the partition parameters.

13. A non-transitory computer-readable storage medium storing a computer program that is executable by a processor to perform the following steps:

receiving a base layer image;

receiving an enhancement layer image;

wherein parsing of the RPU data payload comprises:

decoding the current RPU ID value;

decoding the mapped color space indicator;

decoding the chroma resampling filter indicator;

decoding the pivot mapping parameters; and

decoding the partition parameters.
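
As a non-limiting sketch of the reuse logic recited in claims 1 and 3, the following Python fragment models an RPU data payload as a dictionary of already-decoded fields. Only use_prev_vdr_rpu_flag is a name taken from the claims. The class RpuSyntaxCache, the key prev_vdr_rpu_id, and the remaining field names are hypothetical, and bit-level parsing of the payload is not shown.

```python
from typing import Any, Dict

# Hypothetical field names for the syntax elements listed in claim 1.
SYNTAX_FIELDS = (
    "vdr_rpu_id",               # current RPU ID value
    "mapping_color_space",      # mapped color space indicator
    "chroma_resampling_filter", # chroma resampling filter indicator
    "mapping_params",           # mapping parameters
    "partition_params",         # partition parameters
)


class RpuSyntaxCache:
    """Keeps syntax elements of earlier RPU payloads, keyed by RPU ID."""

    def __init__(self) -> None:
        self._by_id: Dict[int, Dict[str, Any]] = {}

    def decode(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        if payload["use_prev_vdr_rpu_flag"] == 1:
            # Reuse path: the payload names an earlier RPU whose syntax
            # elements are taken over instead of being parsed again.
            prev_id = payload["prev_vdr_rpu_id"]
            if prev_id not in self._by_id:
                raise KeyError(f"no stored RPU payload with ID {prev_id}")
            return dict(self._by_id[prev_id])

        # Explicit path: read every syntax element from the payload and
        # remember it so that later payloads can refer back to it.
        syntax = {name: payload[name] for name in SYNTAX_FIELDS}
        self._by_id[syntax["vdr_rpu_id"]] = syntax
        return dict(syntax)
```

A payload with the flag set to 0 populates the cache under its own RPU ID; a later payload with the flag set to 1 and a matching previous ID returns a copy of the same syntax elements, which can then drive the combination of the base layer image and the enhancement layer image.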