KR102897092B1

KR102897092B1 - Method for video encoding and decoding based on intra prediction and apparatus thereof

Info

Publication number: KR102897092B1
Application number: KR1020240138775A
Authority: KR
Inventors: 이선영; 류창우
Original assignee: 주식회사 아틴스; 가온그룹 주식회사
Priority date: 2023-10-19
Filing date: 2024-10-11
Publication date: 2025-12-09
Anticipated expiration: 2044-10-11
Also published as: KR20250056798A

Abstract

The present invention relates to an improved encoding/decoding method and apparatus based on intra block copy (IBC), one of the intra prediction encoding/decoding methods. A method of decoding an encoded video bitstream according to an embodiment of the present invention may include: reading, from the bitstream, prediction mode information for prediction decoding of a current block of a picture currently being decoded; determining whether the prediction mode information indicates a mode based on intra block copy (IBC); reading from the bitstream, based on the prediction mode information, at least one prediction vector designating the position of at least one reference block in at least one picture, the at least one picture including the picture currently being decoded; obtaining at least one reference block based on the prediction vector; generating a prediction block based on the at least one reference block; and performing prediction decoding on the current block based on the prediction block.

Description

Method and apparatus for video encoding and decoding based on intra prediction

The present invention relates to the field of digital video encoding and decoding, and in particular to methods for encoding and decoding digital video, methods for recording such data, and components, devices, and systems that implement such methods. Specifically, the present invention relates to an improved encoding/decoding method and apparatus based on intra block copy (IBC), one of the intra prediction encoding/decoding methods.

The present invention may belong to the same technical field as at least one of the digital video compression standards known by names such as MPEG-2, MPEG-4 Video, H.263, H.264/AVC, H.265/HEVC, H.266/VVC, VC-1, AV1, QuickTime, VP-9, VP-10, and Motion JPEG, to the technical field of improving the inherent efficiency of such a standard, or to the technical field of improving or replacing such a standard.

Digital video encoding and decoding are widely used across digital video applications. Examples that all depend on video encoding and decoding technology include: digital television broadcasting; transmission of video over communication networks; video calls, video conversations, and video chats; recording and provision of video content on optical media, including video compact discs (VCDs), digital versatile discs (DVDs), and Blu-ray; the entire process of producing, editing, collecting, and distributing video content; and devices such as video capture devices and camcorders used for video shooting and recording carried out for personal, commercial, industrial, security, and other purposes.

Accordingly, implementations that may be referred to as digital video encoders and decoders may form part of a wide range of devices, including digital televisions, digital broadcasting systems, wireless broadcasting systems, notebook/desktop/tablet computers, e-book readers, digital cameras, digital recording devices, digital multimedia playback devices, video game devices/terminals/consoles, mobile phones with multimedia playback capability (including smartphones), video conferencing equipment, and other devices related to the generation, recording, and provision of digital video.

Such digital video encoders and decoders may be implemented according to digital video compression standards that are widely used and understood by those skilled in the art. These standards may include at least one of the compression standards known by names such as MPEG-2, MPEG-4 Video, H.263, H.264/AVC, H.265/HEVC, H.266/VVC, VC-1, AV1, QuickTime, VP-9, VP-10, and Motion JPEG.

Video encoders and decoders may be implemented to encode or decode digital video information more efficiently, either while complying with the above standards or by improving or modifying them. Attempts to modify a standard may also lead to a new standard. One well-known example is the so-called enhanced compression model (ECM), an attempt to improve on and replace the existing H.266/VVC standard, which is being developed by the Joint Video Experts Team (JVET), a joint international standardization group of ISO, IEC, and ITU-T.

Versatile Video Coding (VVC), an international video compression standard, provides, as one of its intra prediction encoding/decoding methods, a way of deriving block information for a target block so that a reference block within the same picture is used as the prediction information for that target block. This technique is regarded as a useful compression tool for computer-generated screen content and is commonly referred to as intra block copy (IBC).

Uncompressed digital video is known to require a considerable amount of information to describe its content, so recording or transmitting it in its original form can be inefficient. Digital video is therefore compressed by various methods before recording or transmission. These methods include lossy encoding and lossless encoding. Lossy encoding sacrifices some image quality to achieve high compression performance, whereas lossless encoding sacrifices some compression performance to prevent image quality degradation. With either approach, there is a need for technology that achieves a high compression ratio while minimizing the loss of image quality, in order to meet the demand for high-quality digital video within the limits of memory storage capacity and communication bandwidth.

The encoding process for such compression requires a variety of operations, for example a series of steps including spatial partitioning of the digital video, partitioning and/or processing in the color channels, removal of spatial redundancy, removal of temporal redundancy, tracking of motion vectors within the video, encoding of differential images, quantization, coefficient scanning, run-length coding, entropy coding, and loop filtering. These encoding operations generally consume computing resources and take a certain amount of time to complete. Likewise, the corresponding decoding operations require computing resources and time. A main goal of video encoding and decoding technology is to ensure that this resource consumption and time do not interfere with the production, recording, distribution, and viewing of digital video.

Accordingly, with respect to the above technical problems in the field of video encoding and decoding, the present invention provides a new technology that can contribute to at least one of the following: improved encoding efficiency, improved decoding efficiency, improved video quality, reduced computation, reduced software size, reduced hardware size, and other performance improvements related to encoding and decoding.

Specifically, the present invention aims to improve the efficiency of IBC-based encoding/decoding methods. In particular, it aims to improve prediction accuracy by increasing the precision of a prediction vector, for example a block vector (BV), by providing a method for refining the prediction vector through template matching (TM), and by providing a bi-predictive mode and a combined intra/inter prediction mode.

The present invention proposes, as one of the intra prediction methods usable in the video encoding and decoding field described above, a method of obtaining the prediction information of a target block from the position indicated by a block vector within the same picture.

To solve the above technical problem, a method of decoding an encoded video bitstream according to an embodiment of the present invention may include: reading, from the bitstream, prediction mode information for prediction decoding of a current block of a picture currently being decoded; determining whether the prediction mode information indicates a mode based on intra block copy (IBC); reading from the bitstream, based on the prediction mode information, at least one prediction vector designating the position of at least one reference block in at least one picture, the at least one picture including the picture currently being decoded; obtaining at least one reference block based on the prediction vector; generating a prediction block based on the at least one reference block; and performing prediction decoding on the current block based on the prediction block.
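The core of the decoding steps above, fetching a reference block from the picture being decoded and using it as the prediction, can be sketched as follows. This is a minimal illustration, not the patented procedure: the function names, the list-of-rows picture layout, and the simple "prediction plus residual" reconstruction are assumptions made for the example, and the check that the block vector points into an already decoded region is deliberately left out.

```python
def fetch_reference_block(picture, x, y, width, height, bv):
    """Copy the width x height block displaced by the block vector bv from (x, y).

    picture is a list of rows of integer samples; bv is (bvx, bvy) in whole pixels.
    Hypothetical helper: it only checks picture bounds, not the decoded region.
    """
    bvx, bvy = bv
    ref_x, ref_y = x + bvx, y + bvy
    if (ref_x < 0 or ref_y < 0
            or ref_y + height > len(picture)
            or ref_x + width > len(picture[0])):
        raise ValueError("block vector points outside the picture")
    return [row[ref_x:ref_x + width] for row in picture[ref_y:ref_y + height]]


def reconstruct_ibc_block(picture, x, y, residual, bv):
    """Predict the block at (x, y) by block copy, then add the decoded residual."""
    height, width = len(residual), len(residual[0])
    # The prediction is copied out completely before any sample is written back.
    pred = fetch_reference_block(picture, x, y, width, height, bv)
    for j in range(height):
        for i in range(width):
            picture[y + j][x + i] = pred[j][i] + residual[j][i]
    return pred
```

Because the reference block comes from the same picture, the decoder needs no additional reference picture for this mode; only the vector and the residual are read from the bitstream.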

Reading the prediction vector may include: obtaining an adaptive block vector resolution (ABVR) flag value from the bitstream; reading precision representation information for the block vector when the ABVR flag value falls within a first range; determining the precision of the prediction vector based on the ABVR flag information and the precision representation information; and reading the prediction vector at the resolution so determined.

The precision of the prediction vector may be determined as one of at least one precision unit, the units including a sub-pixel unit.

The precision of the prediction vector may be determined as follows: quarter-pixel (1/4-pel) units when the ABVR flag value does not fall within the first range; and, when the ABVR flag value falls within the first range, 1-pixel units if the precision representation information has a first value and 4-pixel units if it has a second value.

The first range may mean the case where the ABVR flag value is "1", and the precision representation information may be an index value that can take at least two different values.
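The precision rule described above can be written as a small lookup. The sketch below assumes that the "first value" and "second value" of the precision index are 0 and 1; the text does not state the actual values, and the function names and the quarter-pel internal storage are likewise illustrative assumptions.

```python
from fractions import Fraction

QUARTER_PEL = Fraction(1, 4)
ONE_PEL = Fraction(1)
FOUR_PEL = Fraction(4)


def bv_precision(abvr_flag, precision_idx=None):
    """Return the block vector precision in pixels.

    Assumption: precision_idx 0 stands for the "first value" (1-pel) and
    1 for the "second value" (4-pel). The index is only read when the
    ABVR flag equals 1.
    """
    if abvr_flag != 1:
        return QUARTER_PEL
    if precision_idx == 0:
        return ONE_PEL
    if precision_idx == 1:
        return FOUR_PEL
    raise ValueError("unsupported precision index")


def to_quarter_pel(coded_bv, precision):
    """Scale coded vector components (multiples of precision) to quarter-pel units."""
    return tuple(int(c * precision / QUARTER_PEL) for c in coded_bv)
```

Storing every vector in the finest unit (quarter-pel here) is one common design choice; it lets later stages treat all vectors uniformly regardless of the precision at which they were coded.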

The method may further include: determining whether the prediction mode information indicates a mode that uses refinement of the prediction vector; and, if so, refining the prediction vector by applying a compensation (correction) to it. In that case, obtaining the reference block may operate on the refined prediction vector.

The method may further include dividing the current block into two or more sub-blocks. In that case, refining the prediction vector may compensate the prediction vector individually for each sub-block, starting from the prediction vector, and obtaining the reference block may obtain two or more sub-reference blocks, each based on the prediction vector refined individually for the corresponding sub-block, and combine the sub-reference blocks to obtain the reference block.
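How sub-reference blocks might be combined into one reference block can be sketched as below. The square sub-block size, the dictionary keyed by sub-block offset, and the function name are assumptions for illustration; the per-sub-block vectors are taken as already refined.

```python
def assemble_prediction(picture, x, y, width, height, sub_size, sub_bvs):
    """Build a reference block from per-sub-block vectors.

    sub_bvs maps the (sx, sy) offset of each sub-block inside the current
    block to its own (bvx, bvy) vector in whole pixels (hypothetical layout).
    """
    pred = [[0] * width for _ in range(height)]
    for sy in range(0, height, sub_size):
        for sx in range(0, width, sub_size):
            bvx, bvy = sub_bvs[(sx, sy)]
            for j in range(min(sub_size, height - sy)):
                for i in range(min(sub_size, width - sx)):
                    ref_y = y + sy + j + bvy
                    ref_x = x + sx + i + bvx
                    if not (0 <= ref_y < len(picture)
                            and 0 <= ref_x < len(picture[0])):
                        raise ValueError("sub-block vector points outside the picture")
                    # Each sub-block is copied from its own displaced position.
                    pred[sy + j][sx + i] = picture[ref_y][ref_x]
    return pred
```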

Refining the prediction vector may operate based on a template matching method whose search range is defined relative to the point indicated by the prediction vector.

Refining the prediction vector may apply the compensation at a precision higher than the precision of the prediction vector itself.

Refining the prediction vector may compensate, in sub-pixel units, a prediction vector that was read in integer-pixel units.
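A template-matching refinement of this kind can be sketched as follows. Several details are assumptions rather than statements from the text: the template is taken to be an L-shaped strip of `t` rows above and `t` columns to the left of a block, the matching cost is the sum of absolute differences (SAD), and the search moves in whole-pixel steps only. A sub-pixel refinement as described above would additionally interpolate the candidate templates, which is omitted here for brevity, as is the check that candidates lie in the already decoded region.

```python
def template_pixels(picture, x, y, width, height, t):
    """Return the L-shaped template of the block at (x, y), or None if it leaves the picture."""
    if (x - t < 0 or y - t < 0
            or y + height > len(picture)
            or x + width > len(picture[0])):
        return None
    above = [picture[y - k][x + i] for k in range(1, t + 1) for i in range(width)]
    left = [picture[y + j][x - k] for j in range(height) for k in range(1, t + 1)]
    return above + left


def refine_bv(picture, x, y, width, height, bv, search_range=1, t=1):
    """Search around bv for the vector whose template best matches the current one."""
    cur = template_pixels(picture, x, y, width, height, t)
    if cur is None:
        return bv  # no usable template, keep the signalled vector
    best_bv, best_cost = bv, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cand = (bv[0] + dx, bv[1] + dy)
            ref = template_pixels(picture, x + cand[0], y + cand[1], width, height, t)
            if ref is None:
                continue
            cost = sum(abs(a - b) for a, b in zip(cur, ref))  # SAD cost
            if best_cost is None or cost < best_cost:
                best_bv, best_cost = cand, cost
    return best_bv
```

Because the template consists only of already reconstructed samples, a decoder can repeat the same search as the encoder, so the refined vector itself does not need to be transmitted.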

The method may further include determining whether the prediction mode information indicates a mode that allows two or more prediction vectors. In that case, reading the prediction vector may include reading a first prediction vector and reading a second prediction vector; obtaining the reference block may include obtaining a first reference block based on the first prediction vector and obtaining a second reference block based on the second prediction vector; and generating the prediction block may generate the prediction block through prediction fusion based on two or more reference blocks, including the first reference block and the second reference block.

The first prediction vector and the second prediction vector may both be block vectors for intra prediction, configured to indicate the positions of different reference blocks within the already decoded region of the current picture.

Alternatively, the first prediction vector may be a block vector for intra prediction, configured to indicate the position of the first reference block within the already decoded region of the current picture, and the second prediction vector may be a motion vector for inter prediction, configured to indicate the position of the second reference block within a previously decoded picture.

The prediction fusion may be performed as a weighted sum or weighted average using a combination of weights, and information about the weights may be obtained from the bitstream.
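The weighted-average form of prediction fusion can be illustrated with a few lines of code. The integer weights, the round-half-up rule, and the function name are assumptions for this sketch; in the described method the weight information would come from the bitstream rather than from function arguments.

```python
def fuse_predictions(ref0, ref1, w0=1, w1=1):
    """Blend two equally sized reference blocks into one prediction block.

    Computes (w0 * a + w1 * b) / (w0 + w1) per sample with integer
    arithmetic, rounding halves up.
    """
    total = w0 + w1
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        [(w0 * a + w1 * b + total // 2) // total for a, b in zip(row0, row1)]
        for row0, row1 in zip(ref0, ref1)
    ]
```

The same function covers both cases described above: two reference blocks taken from the current picture, or one taken from the current picture and one from a previously decoded picture.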

To solve the above technical problem, an encoding method for generating a bitstream in which video is encoded, according to an embodiment of the present invention, may include: determining, for a current block of a picture currently being encoded, at least one prediction vector designating the position of at least one reference block in at least one picture, the at least one picture including the picture currently being encoded; obtaining at least one reference block based on the prediction vector; generating a prediction block based on the at least one reference block; performing prediction encoding on the current block based on the prediction block; determining, based on the result of the prediction encoding, the prediction mode for the current block to be a mode based on intra block copy (IBC); generating, based on the prediction encoding mode, bitstream syntax representing the prediction mode and the at least one prediction vector; and writing the bitstream syntax to the bitstream.

The prediction vector may be determined at one of at least one precision, the precisions including a sub-pixel unit, and generating the bitstream syntax may include: generating bitstream syntax representing the unit of the precision based on an adaptive block vector resolution (ABVR) flag value and precision representation information for the block vector; and generating bitstream syntax representing the prediction vector based on the unit of the precision.

The method may further include compensating (correcting) the prediction vector. In that case, obtaining the reference block may operate on the compensated prediction vector; determining the prediction mode may determine the prediction mode to be a mode that uses refinement of the prediction vector; and generating the bitstream syntax may generate bitstream syntax representing the prediction vector as it was before the compensation.

Determining the prediction vector may include determining a first prediction vector and determining a second prediction vector; obtaining the reference block may include obtaining a first reference block based on the first prediction vector and obtaining a second reference block based on the second prediction vector; generating the prediction block may generate the prediction block through prediction fusion based on two or more reference blocks, including the first reference block and the second reference block; and determining the prediction mode may determine the prediction mode to be a mode that allows two or more prediction vectors.

The first prediction vector may be a block vector for intra prediction, configured to indicate the position of the first reference block within the already encoded region of the current picture, and the second prediction vector may be a motion vector for inter prediction, configured to indicate the position of the second reference block within a previously encoded picture.

To solve the above technical problem, a decoder apparatus according to an embodiment of the present invention, configured to decode an encoded video bitstream by means of a computer device, may include: a processor; a memory; an input unit to which the bitstream is input; an output unit that outputs the decoded video; a reference buffer that stores information of at least one decoded picture, including the picture currently being decoded; a bitstream parsing unit having functions of reading, from the bitstream, prediction mode information for prediction decoding of a current block of the picture currently being decoded, determining whether the prediction mode information indicates a mode based on intra block copy (IBC), and reading from the bitstream, based on the prediction mode information, at least one prediction vector designating the position of at least one reference block in at least one of the pictures; a prediction decoding unit having functions of obtaining, based on the prediction vector, at least one reference block from at least one picture stored in the reference buffer, generating a prediction block based on the at least one reference block, and performing prediction decoding on the current block based on the prediction block; and a video decoding unit configured to decode the bitstream based on the result of the prediction decoding.

The present invention provides an improvement to IBC-based encoding/decoding, an intra prediction method in which the prediction information of a target block is obtained from the position indicated by a block vector within the same picture. According to the present invention, the method of calculating and transmitting the block vector information of the block to be decoded, together with the method of refining that block vector during decoding, yields more accurate motion information and thereby improves the compression performance for the target block.

More specifically, the present invention refines the resolution of a prediction vector, for example a block vector (BV), to sub-pixel units, which enables more precise prediction than the conventional integer-pixel units. In addition, applying the template-matching-based prediction vector refinement method to both the IBC skip/merge mode and the IBC AMVP mode improves the accuracy of the prediction vector during decoding and thereby improves prediction performance. The sub-block-level prediction vector refinement method enables more elaborate prediction that reflects local characteristics within a block. Introducing the bi-predictive concept into IBC provides better prediction performance than uni-predictive prediction through weighted prediction using two reference blocks. Finally, the extended bi-predictive method, which combines IBC-based intra prediction with inter prediction, enables more effective prediction by using information from the current picture and a reference picture at the same time.

Based on the technical effects described above, the present invention improves the overall efficiency of IBC-based encoding/decoding and, in particular, achieves more effective encoding of special types of video such as screen content.

FIG. 1 is a conceptual diagram of a video communication system according to an embodiment of the present invention;
FIG. 2 is a conceptual diagram of the arrangement of an encoder and a decoder in a real-time video streaming environment according to an embodiment of the present invention;
FIG. 3 is a conceptual diagram of the functional units of a video decoder according to an embodiment of the present invention;
FIG. 4 is a conceptual diagram of the functional units of a video encoder according to an embodiment of the present invention;
FIG. 5 is a conceptual diagram of frame types according to an embodiment of the present invention;
FIG. 6 is a conceptual diagram showing the structure of a video encoder according to another embodiment of the present invention;
FIG. 7 is a conceptual diagram of intra block copy (IBC) according to an embodiment of the present invention;
FIG. 8 is a conceptual diagram of prediction vector refinement based on template matching according to an embodiment of the present invention;
FIG. 9 is a flowchart showing the prediction vector refinement process according to an embodiment of the present invention;
FIG. 10 is a conceptual diagram of sub-block-based prediction vector refinement according to an embodiment of the present invention;
FIG. 11 is a conceptual diagram showing a prediction method using bi-prediction according to an embodiment of the present invention;
FIG. 12 is a flowchart of the processing of directional flag information according to an embodiment of the present invention; and
FIG. 13 is a conceptual diagram showing a prediction method using extended bi-prediction according to an embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and alternatives falling within its spirit and technical scope.

Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of the plurality of related listed items, and is non-exclusive unless otherwise indicated. Where items are listed in this application, the listing is merely an exemplary description intended to explain the spirit and possible implementations of the invention, and is not intended to limit the scope of the embodiments of the present invention.

In this specification, "A or B" may mean "only A," "only B," or "both A and B." In other words, "A or B" in this specification may be interpreted as "A and/or B." For example, "A, B or C" in this specification may mean "only A," "only B," "only C," or "any combination of A, B and C."

A slash (/) or a comma used in this specification may mean "and/or." For example, "A/B" may mean "A and/or B." Accordingly, "A/B" may mean "only A," "only B," or "both A and B." For example, "A, B, C" may mean "A, B, or C."

In this specification, "at least one of A and B" may mean "only A," "only B," or "both A and B." In addition, the expressions "at least one of A or B" and "at least one of A and/or B" in this specification may be interpreted in the same way as "at least one of A and B."

In addition, in this specification, "at least one of A, B and C" may mean "only A," "only B," "only C," or "any combination of A, B and C." Likewise, "at least one of A, B or C" and "at least one of A, B and/or C" may mean "at least one of A, B and C."

When a component is referred to as being "connected" or "coupled" to another component, it should be understood that it may be directly connected or coupled to that other component, or that other components may be present in between. In contrast, when a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no other component is present in between.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant technology, and are not to be interpreted in an idealized or excessively formal sense unless explicitly so defined in this application.

In describing the invention in this application, embodiments may be described or illustrated in terms of unit blocks that perform the described function or functions. In this application, such blocks may be referred to as one or more devices, units, modules, parts, and the like. The blocks may be implemented in hardware by means of one or more logic gates, integrated circuits, processors, controllers, memories, electronic components, or other information processing hardware, without being limited thereto. Alternatively, the blocks may be implemented in software by means of application software, operating system software, firmware, or other information processing software, without being limited thereto. One block may be implemented as a plurality of separate blocks that perform the same function, and conversely a single block may be implemented to perform the functions of a plurality of blocks simultaneously. The blocks may also be physically separated or combined according to any criterion. The blocks may further be implemented to operate in an environment in which they are remote from one another, with no fixed physical location, and communicate via a communication network, the Internet, a cloud service, or another communication method, without being limited thereto. All of the above implementations fall within the range of embodiments that a person of ordinary skill in the field of information and communication technology may adopt to realize the same technical idea, and any detailed implementation should therefore be interpreted as falling within the technical idea of the invention of this application.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the accompanying drawings. To facilitate overall understanding, the same reference numerals are used for the same components in the drawings, and redundant descriptions of the same components are omitted. The embodiments are not mutually exclusive, and it is assumed that some embodiments may be combined with one or more other embodiments to form new embodiments.

Digital video codec

FIG. 1 is a conceptual diagram of a video communication system according to an embodiment of the present invention. The video communication system (100) may include at least two terminals (110, 120) connected to each other via a network (105).

In one embodiment of the present invention, FIG. 1 may represent a block diagram of a unidirectional video communication network. The first terminal (110) may encode video data in order to transmit (111) the video data via the network (105). The second terminal (120) may be configured to receive (121) the encoded video data via the network, decode it, and display it.

In another embodiment of the present invention, FIG. 1 may represent a block diagram of a bidirectional video communication network. For bidirectional video communication, each of the terminals (110, 120) may be configured to encode video data that it has acquired itself, for video transmission (112, 122) to the other terminal via the network. Each terminal may also be configured to receive (113, 123) video data transmitted by the other terminal via the network, decode it, and display the decoded video data.

Depending on the embodiment, each of the terminals (110, 120) shown in FIG. 1 may be a device such as a server computer, a personal computer, a portable computer, or a smartphone, but is not limited thereto. The present invention applies to any environment in which a unidirectional or bidirectional video communication network is established, and the network (105) should be understood as being formed by any means capable of carrying encoded video data between the terminals (110, 120).

In one embodiment of the present invention, the network (105) may be a wired or wireless communication network. Depending on the embodiment, the network may be configured to exchange information according to any communication standard, and the communication standard may include packet-based communication. Packet communication may be understood to include, for example, packets of the kind known as TCP or UDP.

In another embodiment of the present invention, however, the network (105) may be understood to include a process of conveying information by means of a recording medium. In this case, the network is not limited to a communication medium, and should be understood to include a process in which information is temporarily stored on, and physically transported by, a hard disk, a solid state disk (SSD), a flash memory, a compact disc (CD), a digital versatile disc (DVD), a Blu-ray disc, or another mechanical, electronic, or optical recording medium.

Whatever other means of information communication or transport is applied, it may be regarded as falling within the scope of the embodiments of the present invention as long as its structure supports delivering video data in encoded form for decoding. Therefore, in addition to the examples listed above, any means of information communication or transport, whether previously known or newly provided, may fall within the scope of application of the present invention.

FIG. 2 is a conceptual diagram of the arrangement of an encoder and a decoder in a real-time video streaming environment according to an embodiment of the present invention. The streaming system (200) illustrated in FIG. 2 may be applied to video data communication networks including, for example, digital broadcasting, video telephony, and video conferencing. However, a technical structure identical or similar to the streaming system should be regarded as equally applicable where, as described above, information is transported via a recording medium.

According to one embodiment of the present invention, the streaming system may include a video source (210) that generates a video stream. The video source may include a digital video acquisition means (212), for example a digital camera or other equipment, that acquires uncompressed raw video. The raw video stream (215) may be very large, and may therefore be compressed by a video encoder (217) coupled or connected to the video source.

The encoder (217) may be constituted by means including hardware, software, or a combination of both, arranged to implement a video encoding method and/or an implementation thereof according to an embodiment of the present invention.

The encoder (217) may output an encoded bitstream (219) that is smaller than the raw video stream. The bitstream (219) may be made available over the network in real time by a relay device, which may be called a streaming server (220), and/or may be stored in a recording medium (225) of the streaming server (220) for later use.

The streaming system (200) may include at least one streaming client (230, 240) that connects to the streaming server (220) in order to receive the encoded bitstream (229) in real time or to obtain it later. The streaming client may include a video decoder (232) that obtains the encoded bitstream (229) (which may also be regarded as a copy of the bitstream (219) received by the streaming server), decodes the bitstream (229), and outputs the resulting video data in a form that can be presented on a display (235) or by other visual, auditory, or other sensory presentation means.

The functions for encoding and decoding video data as described above are collectively referred to as a coder-and-decoder, that is, a video codec.

FIG. 3 is a conceptual diagram of the functional units of a video decoder according to an embodiment of the present invention. As shown in FIG. 3, a receiving unit (310) may receive at least one item of encoded video data to be decoded by a decoder (305). In one embodiment of the present invention, the encoded video data may be independent for each reception, and the decoding procedure for each independent item of video data may be independent of the decoding procedures for other video data. The encoded video data may be received by the receiving unit (310) through a hardware or software connection (315) to a device that stores it. As described above, the storing device may be a streaming server located at the far end of a communication network, or may be a physical recording medium, but is not limited thereto.

The receiving unit (310) may receive the encoded video data together with other accompanying data, such as encoded audio data or other auxiliary data. Each such item of data may be separated from the video data and provided to an appropriate processing function unit (312) other than the video decoder.

When the video data is provided through a communication network, a buffer memory (320) may be coupled between the receiving unit (310) and the decoder (305) to minimize delays and interruptions caused by network conditions. The buffer memory (320) may be a computer-readable recording medium that temporarily stores the received video data and supplies it steadily to a parser (330), which serves as the input stage of the decoder (305). However, the buffer memory may be unnecessary when the bandwidth of the communication network is sufficient, when the video data is read from a local recording medium that is not physically remote, or in any other environment in which no communication delay is anticipated.

The video decoder (305) may include the parser (330) as its input stage in order to interpret the encoded video data. The parser separates (parses), according to predetermined rules, the many items of information stored in the encoded video data in the form of a bitstream and, where necessary, performs entropy decoding (335) of entropy-coded video data, thereby reconstructing symbols (338), which are the units of video encoding information. The symbols (338) may include any information for controlling the operation of the decoder (305), and/or may further include information for controlling a device that operates in association with the decoder (305), such as a display. The control information for the display device may include information in the formats known as supplemental enhancement information (SEI) or video usability information (VUI).

As described above, the parser (330) may be configured to perform entropy decoding (335) of the encoded video data. The entropy coding method applied to the encoded video data may differ depending on the coding standard, and decoding is performed correspondingly. Representative examples of entropy coding include variable length coding, Huffman coding, and arithmetic coding. Depending on the standard, each of these methods may be context-adaptive or context-sensitive, and may otherwise be based on principles widely known to those skilled in the art.
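As an illustration of the variable length coding mentioned above, the following minimal sketch parses a bitstream of unsigned Exp-Golomb codewords into symbols. The choice of Exp-Golomb is an assumption made for illustration only; the specification does not name a particular code, and the function names encode_ue, decode_ue, and parse_symbols are hypothetical.

```python
def encode_ue(value: int) -> str:
    """Encode a non-negative integer as an unsigned Exp-Golomb bit string."""
    code = bin(value + 1)[2:]             # binary form of value + 1
    return "0" * (len(code) - 1) + code   # leading zeros announce the length


def decode_ue(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one codeword starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":       # count the leading zeros
        zeros += 1
    end = pos + 2 * zeros + 1             # a codeword spans 2 * zeros + 1 bits
    return int(bits[pos + zeros:end], 2) - 1, end


def parse_symbols(bits: str) -> list[int]:
    """Split a concatenated bit string into its sequence of symbols."""
    symbols, pos = [], 0
    while pos < len(bits):
        value, pos = decode_ue(bits, pos)
        symbols.append(value)
    return symbols
```

Because shorter codewords are assigned to smaller values, a stream dominated by small symbols occupies fewer bits, which is the property the parser relies on when it recovers symbols of differing lengths from one continuous bitstream.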

The parser (330) may be configured to extract at least one picture from the encoded video data. The definition of a picture may differ depending on the coding standard, and, depending on the standard, one or several of the examples listed below may apply at the same time. A picture may be grouped into, defined as, and/or divided into encoding/decoding units such as groups of pictures (GOPs), pictures/frames, tiles, slices, macroblocks, blocks, subblocks, transform units (TUs), and prediction units (PUs).

The parser (330) may be configured to extract encoding information such as transform coefficients, quantization parameters (QPs), and/or motion vectors from the encoded video data. The parser (330) may be configured to perform entropy decoding (335) and parsing on the video data received from the buffer memory, and to selectively decode the symbols (338) representing the encoding information. The parser (330) may also be configured to selectively supply particular symbols (338) to particular decoding function units within the decoder (305), such as the inverse quantization and inverse transform unit (340), the intra prediction unit (350), the inter prediction unit (355), or the loop filter unit (360). How this supply of information is controlled may be determined by the order of the information contained in the encoded video and may differ depending on the coding standard. It is not limited within the scope of the embodiments of the present invention and is therefore not shown in detail in this conceptual diagram.

The decoder (305) may be composed of a number of conceptual functional units that receive and process the encoding information provided by the parser (330). These conceptual functional units may evidently be combined with one another or further subdivided as the implementation requires. For example, they may be further separated for ease of implementation, or integrated into one for operational efficiency. In either case, the functional units may be configured to interact closely with one another. Notwithstanding the possibility of such integration or separation, the video data decoding procedure applied in embodiments of the present invention is described below as a combination of conceptual functional units.

The decoder may include an inverse quantization and inverse transform unit (340). The inverse quantization and inverse transform unit (340) may be configured to receive from the parser (330) encoding information including the transform method to be used, the block size, the quantization coefficients for recovering the quantized information, and identification information for a quantization matrix that represents the quantization coefficients in simplified form. It may be configured to output, as a result of processing the encoding information, block values (341) that can be input to a merging unit (aggregator) (370).
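The processing performed by the inverse quantization and inverse transform unit (340) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular standard: it assumes uniform scalar quantization with a single step size (no quantization matrix) and an orthonormal inverse DCT, and the function names are hypothetical.

```python
import math


def dequantize(levels, step):
    """Scale quantized levels back to transform coefficients."""
    return [[level * step for level in row] for row in levels]


def idct_1d(coeffs):
    """Inverse of the orthonormal DCT-II for one row or column."""
    n = len(coeffs)
    out = []
    for x in range(n):
        total = coeffs[0] / math.sqrt(n)          # DC term
        for k in range(1, n):                     # AC terms
            angle = math.pi * (2 * x + 1) * k / (2 * n)
            total += math.sqrt(2 / n) * coeffs[k] * math.cos(angle)
        out.append(total)
    return out


def idct_2d(block):
    """Separable 2-D inverse transform: rows first, then columns."""
    rows = [idct_1d(row) for row in block]
    cols = [idct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]      # transpose back


def block_values(levels, step):
    """Dequantize, then inverse-transform, giving the block values (341)."""
    return idct_2d(dequantize(levels, step))
```

For example, a 2x2 block whose only non-zero level is a DC level of 4 with a step size of 2 reconstructs to a flat block in which every sample is approximately 4.0.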

In one embodiment of the present invention, the output values of the inverse quantization and inverse transform unit (340) may include intra-prediction-coded block values. An intra-predicted block value is a value that can be decoded using prediction information from within the picture currently being decoded, for example the current frame, without using prediction information from a previously decoded picture, for example a previous frame.

The prediction information from within the current picture may be provided by the intra prediction unit (350). According to an embodiment of the present invention, the intra prediction unit (350) generates, as prediction information, block values of the same shape as the block being decoded, using picture information from spatially adjacent regions of the picture that is currently being decoded and has been partially decoded. The picture information may be provided (381) from a buffer for the current picture, the so-called line buffer (380). Depending on the embodiment, the merging unit (370) may be configured to merge the prediction information (351) generated by the intra prediction unit (350) with the block values (341) provided by the inverse quantization and inverse transform unit (340).
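The cooperation of the intra prediction unit (350) and the merging unit (370) can be sketched as follows. The sketch assumes a DC prediction mode, in which the whole block is filled with the mean of the neighbouring reconstructed samples, and sample-wise addition with clipping in the merging unit. Both are illustrative assumptions, and the function names dc_predict and aggregate are hypothetical.

```python
def dc_predict(above, left, size):
    """Fill a size x size block with the rounded mean of neighbouring samples."""
    neighbours = list(above) + list(left)
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * size for _ in range(size)]


def aggregate(prediction, block_values, bit_depth=8):
    """Add block values to the prediction sample by sample, then clip."""
    top = (1 << bit_depth) - 1            # largest representable sample value
    return [
        [min(max(p + r, 0), top) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction, block_values)
    ]
```

Here above and left stand for the already reconstructed samples read from the line buffer (380), and the result of aggregate corresponds to the output values (371) of the merging unit.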

In another embodiment, the output values of the inverse quantization and inverse transform unit (340) may include inter-prediction-coded block values, which in some cases are block values to which motion compensation has been applied. In this case, the inter prediction unit (355) may extract sample information (386) used for motion-based prediction from a reference picture buffer (385). Information (356) derived by performing motion compensation on the sample information, based on the symbols (338) associated with the block values, may be merged by the merging unit (370) with the block values (341) provided by the inverse quantization and inverse transform unit (340). In this case, the block values (341) may be referred to as differential or residual values.

The location in memory from which the inter prediction unit (355) extracts the sample information from the reference picture may be determined by a motion vector provided to the inter prediction unit (355), the motion vector being composed of a combination of symbols (338) indicating, for example, X, Y, and any other values needed to identify a particular point in the reference picture. When a motion vector capable of so-called "subsampling" is provided, the inter prediction unit (355) may also include a function for interpolating the sample values before use, and may further include a function for predicting and refining the value of the motion vector.
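The use of a motion vector to locate and interpolate reference samples can be sketched as follows. The sketch assumes motion vectors expressed in quarter-sample units and bilinear interpolation with border clamping. Practical codecs generally use longer interpolation filters, so this is an illustration of the principle only, and the names FRAC_BITS, fetch_sample, and motion_compensate are hypothetical.

```python
FRAC_BITS = 2                      # assumed precision: 1/4 sample
SCALE = 1 << FRAC_BITS


def fetch_sample(ref, x_q, y_q):
    """Read one sample at a position given in 1/SCALE sample units."""
    x0, y0 = x_q >> FRAC_BITS, y_q >> FRAC_BITS       # integer part
    fx, fy = x_q & (SCALE - 1), y_q & (SCALE - 1)     # fractional part
    h, w = len(ref), len(ref[0])

    def at(x, y):
        # Clamp coordinates so reads outside the picture repeat the border.
        return ref[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    top = at(x0, y0) * (SCALE - fx) + at(x0 + 1, y0) * fx
    bottom = at(x0, y0 + 1) * (SCALE - fx) + at(x0 + 1, y0 + 1) * fx
    total = top * (SCALE - fy) + bottom * fy
    return (total + SCALE * SCALE // 2) // (SCALE * SCALE)   # round to nearest


def motion_compensate(ref, x, y, mv, size):
    """Build a size x size prediction block at (x, y) displaced by mv."""
    mvx, mvy = mv
    return [
        [fetch_sample(ref, (x + i) * SCALE + mvx, (y + j) * SCALE + mvy)
         for i in range(size)]
        for j in range(size)
    ]
```

With a zero motion vector the block is copied unchanged from the reference picture; a motion vector of (2, 0) in quarter-sample units selects positions halfway between horizontally adjacent samples, which is where interpolation becomes necessary.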

The output values (371) of the merging unit (370) may be provided to the loop filter unit (360) and processed by various loop filtering methods. The loop filter unit (360) may be configured to receive not only the block-level output (371) of the merging unit (370) but also the symbols (338) provided by the parser (330), which control its operation. The output of the loop filter unit (360) may be sent through the output connection (390) to an external display means such as the display device. It may also be stored (361) in the line buffer (380), and from there in the reference picture buffer (385), so that it can be used in prediction when interpreting subsequent intra- or inter-coded block values.

특정한 픽쳐들, 가령 프레임들은, 그 복호화가 완료되고 나면, 사후의 복호화 과정에서 예측 복호화를 수행하기 위한 참조 픽쳐로서 활용될 수 있다. 하나의 픽쳐(또는 프레임)는, 라인 버퍼(380)에 단계적으로 축적되며 복호화가 진행될 수 있으며, 하나의 프레임이 복호화 완료되는 경우, 상기 라인 버퍼(380)의 내용은 상기 참조 픽쳐 버퍼(385)로 이전(383)되고, 새로운 라인 버퍼(380)가 새로운 프레임의 복호화를 위하여 할당될 수 있다.Certain pictures, such as frames, after their decoding is completed, can be utilized as reference pictures for performing predictive decoding in a subsequent decoding process. A picture (or frame) can be gradually accumulated in a line buffer (380) and decoded. When the decoding of a frame is completed, the contents of the line buffer (380) are transferred (383) to the reference picture buffer (385), and a new line buffer (380) can be allocated for decoding the new frame.

The video decoder (305) may be configured to perform decoding according to a predetermined video compression technique that may be documented in various international or commercial standards. Such standards may include, for example, H.264, H.265, and H.266, which are Recommendations defined by the ITU Telecommunication Standardization Sector (ITU-T). Those skilled in the art will understand that each of these Recommendations is equivalent to an international standard jointly defined by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). The encoded video data may conform to the specific bitstream syntax defined by the relevant standard, as defined and required by the video compression specification and, more specifically, by the profiles and levels specified in it. Conformance to a profile and level may also require that the complexity of the encoded video data be limited to a certain degree. For example, a profile or level may limit the maximum picture size, the maximum decoding rate, and the maximum reference picture size. In some embodiments, these limits may be further restricted through the hypothetical reference decoder (HRD) and through metadata for HRD buffer management signaled in the encoded video data.

본 발명의 일 실시예에 따르면, 상기 수신부(310)는 상기 인코딩된 비디오와 함께 추가적인 중복 데이터를 수신할 수 있다. 상기 추가적인 데이터는 상기 부호화된 비디오 데이터의 일부로서 간주될 수 있다. 상기 추가적인 데이터는 데이터를 적절히 복호화하기 위해서, 또는 부호화 전 영상에 근접하는 영상을 보다 정확하게 재구성하기 위해서, 상기 복호화기(305)에 의하여 사용될 수 있는 정보를 포함할 수 있다. 상기 추가적인 데이터는 예를 들어, 시간, 공간, 또는 신호 대 잡음 비(SNR) 향상을 위한 계층들, 중복 슬라이스들, 중복 픽쳐들, 및 순방향 오류 정정 코드들과 같은 형태로 제공될 수 있다.According to one embodiment of the present invention, the receiver (310) may receive additional redundant data along with the encoded video. The additional data may be considered as part of the encoded video data. The additional data may include information that may be used by the decoder (305) to properly decode the data or to more accurately reconstruct an image that approximates the original image. The additional data may be provided in the form of, for example, layers for temporal, spatial, or signal-to-noise ratio (SNR) enhancement, redundant slices, redundant pictures, and forward error correction codes.

도 4는 본 발명의 일 실시예에 따른 비디오 부호화기의 기능부 단위 개념도이다. 상기 부호화기(405)는 비디오 소스(401)로부터 원본 비디오 정보(402)를 수신하여 부호화를 실시하도록 구성될 수 있다.Figure 4 is a functional unit conceptual diagram of a video encoder according to one embodiment of the present invention. The encoder (405) may be configured to receive original video information (402) from a video source (401) and perform encoding.

상기 원본 비디오 정보(402)는 임의의 적합한 비트 심도(bit depth)를 가질 수 있으며, 예를 들어 8비트, 10비트, 12비트 등을 가질 수 있다. 또한, 상기 원본 비디오 정보(402)는 임의의 적합한 색공간을 가질 수 있으며, 예를 들어, R/G/B, Y/U/V, Y/Cb/Cr 등을 가질 수 있다. 또한, 상기 원본 비디오 정보(402)는 상기 색공간에 대응하는 임의의 적합한 샘플링 구조를 가질 수 있으며, 예를 들어 Y/Cb/Cr 4:2:0, Y/Cb/Cr 4:4:4와 같은 형태를 가질 수 있다. 이러한 소정의 형식을 가진 원본 비디오 정보(402)는 디지털 비디오 스트림의 형태로 상기 부호화기에 제공될 수 있다.The original video information (402) may have any suitable bit depth, for example, 8 bits, 10 bits, 12 bits, etc. In addition, the original video information (402) may have any suitable color space, for example, R/G/B, Y/U/V, Y/Cb/Cr, etc. In addition, the original video information (402) may have any suitable sampling structure corresponding to the color space, for example, Y/Cb/Cr 4:2:0, Y/Cb/Cr 4:4:4, etc. The original video information (402) having such a predetermined format may be provided to the encoder in the form of a digital video stream.
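As a rough illustration of how the sampling structure and bit depth determine the amount of raw data per picture, the following sketch computes plane dimensions and an uncompressed frame size. It covers only the two sampling structures named above and assumes that samples deeper than 8 bits are stored unpacked in whole bytes; the helper names are hypothetical.

```python
# Horizontal and vertical chroma subsampling factors for each structure.
SUBSAMPLING = {"4:4:4": (1, 1), "4:2:0": (2, 2)}

def plane_sizes(width, height, sampling):
    """Return (width, height) of the Y, Cb, and Cr planes."""
    sx, sy = SUBSAMPLING[sampling]
    # Round up so odd picture dimensions still cover every luma sample.
    chroma = ((width + sx - 1) // sx, (height + sy - 1) // sy)
    return {"Y": (width, height), "Cb": chroma, "Cr": chroma}

def frame_bytes(width, height, sampling, bit_depth):
    """Uncompressed size of one picture, assuming whole bytes per sample."""
    bytes_per_sample = (bit_depth + 7) // 8
    sizes = plane_sizes(width, height, sampling)
    return sum(w * h for w, h in sizes.values()) * bytes_per_sample

sizes_420 = plane_sizes(1920, 1080, "4:2:0")
raw_8bit_420 = frame_bytes(1920, 1080, "4:2:0", 8)
raw_10bit_444 = frame_bytes(1920, 1080, "4:4:4", 10)
```

Under these assumptions, a 1920x1080 picture in 4:2:0 carries chroma planes of 960x540 samples each, which is half the total sample count of the same picture in 4:4:4.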

단방향 비디오 통신 네트워크에 있어서, 상기 원본 비디오 정보(402)는, 사전에 준비되어 있는 비디오 원본을 저장해 둔 기록매체로부터 획득될 수 있다. 양방향 비디오 통신 네트워크에 있어서, 상기 원본 비디오 정보(402)는 상기 양방향 비디오 통신에 포함되는 적어도 하나의 비디오 송출 스트림을 생성하는 영상 취득 장치, 예를 들어 카메라와 같은 장치로부터 획득될 수 있다.In a one-way video communication network, the original video information (402) can be obtained from a recording medium storing a previously prepared video source. In a two-way video communication network, the original video information (402) can be obtained from an image acquisition device, such as a camera, that generates at least one video transmission stream included in the two-way video communication.

Video data including the original video information (402) may be organized as a plurality of pictures that simulate motion when played back in temporal order. A picture may also be referred to as a frame. Each picture may include one or more samples, depending on the sampling structure, color space, and other properties in use. Those skilled in the art will understand that a sample is closely related to a pixel of a digital image. The operation of the encoder is described below in terms of such samples.

본 발명의 일 실시예에 따르면, 부호화기(405)는 상기 원본 비디오 정보(402)를 구성하는 픽쳐(및/또는 그것이 그룹화되거나 분할된 정보)들을 실시간으로 (또는 실시방법에 따라 필요로 되는 다른 시간적 요구조건에 의하여) 부호화된 비디오 정보의 형태로 부호화 및 압축하도록 구성될 수 있다.According to one embodiment of the present invention, the encoder (405) may be configured to encode and compress pictures (and/or grouped or segmented information) constituting the original video information (402) in real time (or according to other temporal requirements required according to the implementation method) into the form of encoded video information.

상기 부호화기(405)에 있어, 제어부(450)는 적절한 부호화 속도를 제어하도록 구성되는 기능부일 수 있다. 상기 제어부(450)는 이하 설명되는 바와 같이 다른 기능부들을 제어하고 하기 기능부들에 기능적으로 결합되도록 구성될 수 있다. 상기 제어부(450)에 의해 설정되는 파라미터들은 비트 전송율(bitrate) 제어에 관련된 파라미터, 예를 들면 픽쳐의 스킵(skip), 양자화기(quantizer), 화질 최적화 기법의 적용을 위한 변수값 등을 포함할 수 있으며, 또한 픽쳐의 크기, 픽쳐 그룹(group of pictures; GOP)의 구조, 움직임 벡터의 최대 검색 범위와 같은 값을 포함할 수 있다. 통상의 기술자는 상기 제어부(450)가 가질 수 있는 다양한 다른 기능들에 대하여 이해할 수 있을 것이며, 그러한 다른 기능들은 개별적 시스템 설계에 최적화된 비디오 부호화기의 설계에 따라 부가 또는 제거될 수 있는 것들이다.In the encoder (405), the control unit (450) may be a functional unit configured to control an appropriate encoding speed. The control unit (450) may be configured to control other functional units and be functionally coupled to the following functional units as described below. The parameters set by the control unit (450) may include parameters related to bitrate control, such as picture skip, quantizer, and variable values for applying image quality optimization techniques, and may also include values such as picture size, group of pictures (GOP) structure, and maximum search range of motion vectors. A person skilled in the art will be able to understand various other functions that the control unit (450) may have, and such other functions may be added or removed according to the design of a video encoder optimized for individual system design.

본 발명의 실시예에 따라서, 상기 부호화기(405)는 통상의 기술자에게 잘 알려진 "코딩 루프(coding loop)"와 같은 구조로 동작하도록 구성될 수 있다. 예시적으로 단순화하여 설명하면, 상기 코딩 루프는, 부호화될 픽쳐를 입력받고 종래에 부호화한 적어도 하나의 참조 픽쳐에 기반하여 심벌(symbol)들을 생성하는 것을 담당하는 내부 부호화기(이른바 "소스 코더(source coder)")(410), 및 상기 내부 부호화기에 접속되도록 구성되는 내부 복호화기(local decoder)(420)로 구성될 수 있다. 상기 내부 복호화기(420)는 상기 내부 부호화기(410)의 출력을 제공받음으로써, 상기 부호화기(405)로부터 부호화된 비디오 정보를 전달받게 될 실제 원격지에 있는 복호화기(490)가 생성하게 될 샘플 데이터를 재현하기 위한 동작을 수행하도록 구성될 수 있다. According to an embodiment of the present invention, the encoder (405) may be configured to operate in a structure such as a "coding loop" well known to those skilled in the art. To simplify the description by way of example, the coding loop may be configured with an internal encoder (so-called "source coder") (410) which is responsible for receiving a picture to be encoded and generating symbols based on at least one reference picture that has been encoded in the past, and a local decoder (420) configured to be connected to the internal encoder. The local decoder (420) may be configured to perform an operation to reproduce sample data to be generated by a decoder (490) located at an actual remote location that will receive encoded video information from the encoder (405) by receiving an output of the internal encoder (410).

Video data composed of the sample data reconstructed by the internal decoder (420) may be input to the reference picture buffer of the encoder (405). As described above, the internal decoder (420) is implemented to reproduce the result that a remote decoder will obtain when decoding the output of the encoder (405), so the video data written to this reference picture buffer may be bit-exact with the contents of the remote decoder's reference picture buffer. In other words, a prediction function unit in the encoder (405) can read from the encoder's reference picture buffer the same sample values of previous frames that the decoder will later reference during decoding.
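The effect of keeping a local decoder inside the encoder can be sketched with a toy coding loop. This is a deliberately simplified model, not the codec described here: prediction is the co-located sample of the previous reconstructed picture, the only lossy step is uniform quantization with a hypothetical step size `QSTEP`, and entropy coding is omitted.

```python
import numpy as np

QSTEP = 8  # hypothetical quantization step size

def quantize(residual):
    return np.round(residual / QSTEP).astype(np.int32)

def dequantize(levels):
    return levels * QSTEP

def decode_picture(levels, reference):
    """Shared reconstruction step used by both local and remote decoders."""
    return np.clip(reference + dequantize(levels), 0, 255)

def encode_sequence(pictures):
    """Encode pictures, updating the reference with the LOCAL decoder output."""
    reference = np.zeros(pictures[0].shape, dtype=np.int32)
    symbols = []
    for picture in pictures:
        levels = quantize(picture.astype(np.int32) - reference)
        symbols.append(levels)
        # Use the reconstructed picture, not the original, as the next reference.
        reference = decode_picture(levels, reference)
    return symbols, reference

def decode_sequence(symbols, shape):
    """Remote decoder: sees only the symbols, never the original pictures."""
    reference = np.zeros(shape, dtype=np.int32)
    for levels in symbols:
        reference = decode_picture(levels, reference)
    return reference

rng = np.random.default_rng(0)
pictures = [rng.integers(0, 256, size=(4, 4)) for _ in range(3)]
symbols, encoder_reference = encode_sequence(pictures)
decoder_reference = decode_sequence(symbols, pictures[0].shape)
references_match = np.array_equal(encoder_reference, decoder_reference)
```

Because both sides apply the same `decode_picture` step to the same symbols, the two reference buffers stay identical even though the reconstruction differs from the original pictures.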

The principle by which the internal decoder (420) on the encoder (405) side keeps the reference picture buffers of the encoder (405) and the decoder (490) in agreement is well known to those skilled in the art. Methods for handling situations in which this agreement cannot be guaranteed (for example, information loss due to a communication failure) may likewise follow what is known to those skilled in the art.

An embodiment of the operation of the internal decoder (420) has been described in detail above with reference to FIG. 3. The decoder of FIG. 3 may be regarded as the "remote" decoder (490) mentioned above. The internal decoder (420) may be implemented without lossless coding and decoding stages such as the parser (330) and entropy decoding (335). Because the internal decoder (420) exists only to reproduce the operation of the remote decoder, it can decode the symbols directly, without compressing them and then restoring them. Accordingly, the parser and entropy decoder shown in FIG. 3, together with the functional units that precede them, may be omitted or only partially implemented.

상술하는 바와 같이, 본 발명의 바람직한 실시방법에 따르면, 복호화기에 존재하는 (파서 및 엔트로피 복호화기를 제외할 수 있는) 임의의 복호화기 기능부는 자연히 대응하는 부호화기(405)에서 실질적으로 동일한 기능부로서 존재할 수 있다.As described above, according to a preferred embodiment of the present invention, any decoder function (excluding a parser and an entropy decoder) present in the decoder can naturally exist as a substantially identical function in the corresponding encoder (405).

상기 부호화기(405)에 포함될 수 있는 부호화 기능부의 동작은 상기 복호화기 기능부의 역동작으로 간주할 수 있다. 따라서 대체로는 상기 복호화기 기능부의 동작을 반대로 수행하는 것으로써 그 실시예를 해설할 수 있다. 예를 들어, 역양자화 및 역변환부에 대응하는 양자화(quantization) 및 변환(transform) 기능부가 제공될 수 있으며, 인터 예측부에 대응하는 인터 예측 부호화부가 제공될 수 있는 것과 같다. 이에 더하여, 일부 추가적으로 설명을 부가하기로 한다.The operation of the encoding function unit that may be included in the above encoder (405) can be considered as the reverse operation of the decoder function unit. Therefore, the embodiment can be explained by performing the operation of the decoder function unit in reverse. For example, a quantization and transform function unit corresponding to the inverse quantization and inverse transform unit may be provided, and an inter prediction encoding unit corresponding to the inter prediction unit may be provided. In addition, some additional explanations will be added.

The internal encoder (410) may be configured to encode input picture information, for example an input frame, using a predictive coding method executed by the prediction encoding unit (440). The prediction encoding unit (440) operates by referencing, from the reference picture buffer (430), at least one piece of reference picture information, for example at least one picture (or frame) of the video data that was coded earlier in temporal order and designated as a reference frame. In this case, the encoder (405) may be configured to encode the difference between blocks of samples of the input picture and blocks of samples of the reference picture.

The internal decoder (420) can decode, from the symbols generated by the internal encoder (410), video data that may be designated as a reference picture. As described above, this decoding is identical to the decoding performed by the remote decoder. The video data used as a reference picture may therefore be provided to the encoder (405) in a form that has passed through lossy compression and is partially degraded, and this behavior may be intentional in order to keep the encoder consistent with the decoder.

The prediction encoding unit (440) may be configured to perform a prediction search within the encoder (405). The prediction search may refer to an operation corresponding to the inter prediction or intra prediction described for the decoder. For newly input picture information that is to be encoded, the prediction encoding unit (440) may access the reference picture buffer (430) to obtain a motion vector, which indicates the point in a reference picture that can serve as suitable prediction reference information for the new picture, a block shape, metadata that may contain these, and the sample block that will actually be referenced. To obtain suitable prediction reference information, the prediction encoding unit (440) may operate on a so-called "sample block by pixel block" basis. According to one embodiment of the present invention, at least one piece of prediction reference information pointing to at least one piece of reference picture information stored in the reference picture buffer (430) may be assigned to the input picture, as determined from the search results obtained by the prediction encoding unit (440).

본 발명의 일 실시예에 있어, 상기 제어부(450)는, 비디오 데이터를 부호화하기 위하여 사용되는 파라미터들의 설정을 포함하여, 내부 부호화기(410)의 부호화 동작 전반을 관리하도록 구성될 수 있다.In one embodiment of the present invention, the control unit (450) may be configured to manage the overall encoding operation of the internal encoder (410), including setting parameters used to encode video data.

The outputs of all the functional units described above may be subjected to entropy encoding (460) before final output. The entropy encoding (460) may apply to the symbols generated by the various functional units any of the entropy coding techniques described earlier, for example variable-length coding, Huffman coding, or arithmetic coding. Depending on the standard, each of these methods may be context-adaptive or context-sensitive, or may be based on other principles widely known to those skilled in the art. Entropy encoding (460) typically achieves lossless compression and may accordingly be configured to convert at least one symbol generated by the functional units into encoded video data.

In controlling the operation of the encoder (405), the control unit (450) may assign to each picture (or frame) a type that determines how that picture is coded during the coding interval. The type may affect the way in which the picture is encoded. Depending on the embodiment, the types may include the following "frame types."

도 5는 본 발명의 일 실시예에 의한 프레임 유형의 개념도이다. 이하 도 5를 함께 참조하여 설명한다.Fig. 5 is a conceptual diagram of a frame type according to one embodiment of the present invention. The following description will be made with reference to Fig. 5.

An intra ("I") picture (510) may refer to a picture that can be encoded and decoded using only its own information, without predictive coding that references other pictures in the video data. Depending on the video coding standard, an "I" picture may be called a key frame, an independent/instantaneous decoder refresh (IDR) frame, or a clean random access (CRA) frame. "I" pictures under these different names may have various modifications and uses as permitted by each standard and may differ from one another in part. Beyond those listed above, "I" pictures may be implemented by various methods that are already known to those skilled in the art or that may be newly provided.

A prediction ("P") picture (520) may refer to a picture that can be encoded and decoded through intra or inter prediction based on at least one piece of prediction information and/or motion vector that points to at least one reference picture, in order to predict the sample values of the blocks constituting the picture. Depending on the video coding standard, a "P" picture may be configured to reference only one reference frame or to reference more than one reference frame. When more than one reference frame is referenced, sample information and/or associated metadata from multiple reference pictures may be used to reconstruct a single block. In the common case, however, a picture designated as a "P" picture may be understood as a picture that references only temporally preceding pictures.

양방향 예측(bidirectional prediction, "B") 픽쳐(530)는 상기 픽쳐를 구성하는 블록의 샘플 값들을 예측하기 위하여 적어도 둘 이상의 참조 픽쳐를 지목하는 적어도 하나의 예측 정보 및/또는 움직임 벡터에 기반하여 인트라 또는 인터 예측을 통해 부호화되고 또한 복호화될 수 있는 픽쳐를 의미할 수 있다. 공통적인 경우에 있어, 상기 "B" 픽쳐로 지정된 픽쳐는, 상기 "P" 픽쳐로 지정된 픽쳐와 구별되며, 시간적으로 선행하는 픽쳐에 한정하지 아니하고 참조를 실행하는 픽쳐로 이해될 수 있다.A bidirectional prediction ("B") picture (530) may refer to a picture that can be encoded and decoded through intra or inter prediction based on at least one piece of prediction information and/or a motion vector that designates at least two reference pictures in order to predict sample values of blocks constituting the picture. In a common case, a picture designated as the "B" picture is distinguished from a picture designated as the "P" picture, and may be understood as a picture that performs a reference without being limited to a temporally preceding picture.

During encoding and decoding, video data may be spatially divided into a plurality of sample blocks, and coding may proceed block by block. Block sizes include, as is widely known, 4x4, 8x8, 4x8, or 16x16 in units of horizontal/vertical pixels, but are not limited to these. A block may be coded by a predictive coding method with reference to other (already coded) blocks, as permitted and/or restricted by the type assigned to the picture that contains the block. For example, blocks of an "I" picture (510) may be coded without predictive coding, or with reference to already coded blocks within the same picture. That is, only so-called intra prediction may be used. In contrast, a "P" picture (520) may additionally reference at least one reference picture coded in an earlier time unit, so inter prediction may be used for coding along with intra prediction. A "B" picture (530) may also reference pictures that precede it in coding order but follow it in time. It is widely known, however, that a "P" picture or a "B" picture may also contain blocks that are coded without relying on predictive coding.
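The reference constraints for the three picture types can be summarized in a short sketch. It models only the common case described above (a "P" picture referencing temporally preceding pictures only); the `Picture` class and `allowed_references` function are hypothetical names introduced for illustration, and real standards manage reference lists in considerably more detail.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Picture:
    name: str
    display_order: int  # position in time
    coding_order: int   # position in the bitstream
    kind: str           # "I", "P", or "B"

def allowed_references(current, pictures):
    """Return the pictures that `current` may use for inter prediction."""
    # Only pictures that were already coded can serve as references.
    coded = [p for p in pictures if p.coding_order < current.coding_order]
    if current.kind == "I":
        return []  # intra prediction only
    if current.kind == "P":
        return [p for p in coded if p.display_order < current.display_order]
    if current.kind == "B":
        return coded  # may include pictures that come later in time
    raise ValueError(f"unknown picture type: {current.kind}")

# Display order I0, B2, P4, coded in the order I0, P4, B2.
i0 = Picture("I0", display_order=0, coding_order=0, kind="I")
p4 = Picture("P4", display_order=4, coding_order=1, kind="P")
b2 = Picture("B2", display_order=2, coding_order=2, kind="B")
sequence = [i0, p4, b2]
refs = {p.name: [r.name for r in allowed_references(p, sequence)] for p in sequence}
```

In this example the "B" picture is coded last so that it can reference P4, which follows it in time but precedes it in coding order.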

상기 비디오 부호화기(405)는, 다양한 국제 표준 규격 또는 상용 규격에 의해 문서화될 수 있는 미리 결정된 비디오 압축 기술에 따라 부호화 동작을 수행하도록 구성될 수 있다. 상기 규격의 예는 상기 복호화기에서 서술한 것을 모두 포함할 수 있다. The above video encoder (405) may be configured to perform encoding operations according to a predetermined video compression technique that may be documented by various international standards or commercial standards. Examples of the above standards may include all of those described in the above decoder.

본 발명의 일 실시예에 따르면, 송신부(470)는, 상기 부호화된 비디오 데이터를 저장하는 장치에 대하여 하드웨어적 또는 소프트웨어적 연결(495)을 통해 상기 비디오 데이터를 (궁극적으로 원격지의 복호화기(490)에) 제공/송신하기 위하여 상기 엔트로피 부호화에 의하여 생성된 부호화된 비디오 데이터를 버퍼링할 수 있다. 실시예에 따라서, 상기 송신부(470)는 비디오 부호화기(405)로부터 부호화된 비디오 데이터를 제공/송신함에 있어서, 상기 부호화된 비디오 데이터에 동반되는 다른 데이터, 예를 들면 부호화된 오디오 데이터나 그 밖의 보조 데이터들을 별도의 공급원(480)으로부터 제공받아 병합할 수 있다.According to one embodiment of the present invention, the transmitter (470) may buffer the encoded video data generated by the entropy encoding in order to provide/transmit the video data (ultimately to a remote decoder (490)) to a device storing the encoded video data via a hardware or software connection (495). According to an embodiment, when providing/transmitting the encoded video data from the video encoder (405), the transmitter (470) may receive and merge other data accompanying the encoded video data, for example, encoded audio data or other auxiliary data, from a separate source (480).

본 발명의 일 실시예에 따르면, 상기 송신부(470)는 상기 부호화된 비디오와 함께 추가적인 데이터를 더 송신하도록 구성될 수 있다. 상기 추가적인 데이터는 상기 부호화된 비디오 데이터의 일부로서 간주될 수 있다. 상기 추가적인 데이터는 데이터를 적절히 복호화하기 위해서, 또는 부호화 전 영상에 근접하는 영상을 보다 정확하게 재구성하기 위해서, 복호화기에 의하여 사용될 수 있는 정보를 포함할 수 있다. 상기 추가적인 데이터의 예는 앞서 복호화기의 수신부(310)와 관련하여 나타낸 예시를 모두 포함할 수 있다.According to one embodiment of the present invention, the transmitter (470) may be configured to transmit additional data along with the encoded video. The additional data may be considered part of the encoded video data. The additional data may include information that can be used by a decoder to properly decode the data or to more accurately reconstruct an image that approximates the original image. Examples of the additional data may include all of the examples previously presented with respect to the receiver (310) of the decoder.

본 발명은 상술한 바와 같이 통상의 기술자에게 이해되어 널리 사용되는 디지털 동영상 압축 규격에 의하여 구현될 수 있다. 상기 디지털 동영상 압축 규격에는 MPEG-2, MPEG-4 Video, H.263, H.264/AVC, H.265/HEVC, H.266/VVC, VC-1, AV1, QuickTime, VP-9, VP-10, Motion JPEG과 같은 규격명으로 알려진 압축 규격 중 적어도 하나가 포함될 수 있다.The present invention can be implemented by a digital video compression standard that is widely used and understood by those skilled in the art as described above. The digital video compression standard may include at least one of compression standards known by the standard name such as MPEG-2, MPEG-4 Video, H.263, H.264/AVC, H.265/HEVC, H.266/VVC, VC-1, AV1, QuickTime, VP-9, VP-10, and Motion JPEG.

FIG. 6 is a conceptual diagram illustrating the structure of a video encoder according to another embodiment of the present invention. FIG. 6 may show the approximate structure of the video encoder that is designated ITU-T H.266 and ISO/IEC 23090-3 and is widely known as MPEG-I Part 3 or by the common name versatile video coding (VVC).

According to FIG. 6, the video encoder (605) may be configured to receive uncompressed, unencoded original video data (601) as input and to output an encoded bitstream (602). When intra-coded, the video data (601) may be supplied directly to a luma mapping unit (610a); otherwise it may be supplied to a luma mapping unit (610b) via an inter prediction unit (620) that includes motion vector extraction. In the intra-coded case, the mapped luma signal may be supplied to the output merger (606) either alone or together with a selection (608) of at least one of the intra-prediction-coded signal that has passed through the intra prediction unit (625) and the inter-prediction-coded signal output from the luma mapping unit (610b) via the inter prediction unit (620). The result of the output merger may be applied to a chroma scaling unit (615). (The operations of the luma mapping unit (610) and the chroma scaling unit (615) are collectively referred to as the luma mapping/chroma scaling (LMCS) process.) The scaled chroma signal may be provided to a transform unit (630), which may in particular apply an adaptive color transform to the chroma signal. The coefficients resulting from the transform are applied to a quantization unit (640) and quantized. Lossy compression is thereby achieved, and its result may be output as the bitstream (602) after passing through a multi-hypothesis context-adaptive binary arithmetic coding (multi-hypothesis CABAC) unit (650), which is a lossless compression method.

Meanwhile, to form the coding loop, the result of the lossy compression may pass through inverse quantization (645), inverse transform (635), and luma signal expansion (617), and thereby effectively enter the decoding path. The result of the luma signal expansion may be supplied to the internal merger (607) together with the result of selecting (608) at least one of the previously generated intra-prediction-coded signal and inter-prediction-coded signal. The result of the internal merger may pass through inverse luma mapping (617) and then through processing that reproduces the picture quality improvement performed in the decoder, such as a deblocking filter (660), sample adaptive offset (SAO) (670), and an adaptive loop filter (ALF). The result of reproducing the decoder's operation in this way is applied to the reference picture buffer (690) so that it can be reused for predictive coding by the inter prediction unit (620).

The present invention may also be used by, or incorporated into, the enhanced compression model (ECM), an implementation of a next-generation video codec being developed by the Joint Video Experts Team (JVET), an international group of standardization experts. The enhanced compression model may include an improved intra prediction coding method, an improved inter prediction coding method, an improved transform and transform coefficient coding method, an improved adaptive loop filtering method, a bilateral filtering method, a new sample adaptive offset (SAO) method for improving picture quality, an extended entropy coding method, and an improved gradual decoding refresh (GDR) technique.

Composition of the present invention

FIG. 7 is a conceptual diagram of intra block copy (IBC) according to an embodiment of the present invention. Referring to FIG. 7, the IBC technique may be configured to obtain the prediction information of the block to be encoded/decoded (current block) (710) from a reference block (720) that belongs to an area (790) of the same picture in which decoding has already been completed. The prediction vector information indicating the position of the reference block (720) within the same picture may include a block vector (BV) (730).
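The copy operation described for FIG. 7 can be illustrated with a minimal sketch. The function name `ibc_predict`, the list-of-rows picture layout, and the integer-sample block vector are assumptions made for illustration only; the sketch checks the picture bounds but leaves it to the caller to ensure that the reference block lies inside the already decoded area.

```python
def ibc_predict(recon, x0, y0, width, height, bv):
    """Return the reference block addressed by an integer block vector.

    recon is the current picture as a list of rows (hypothetical layout);
    (x0, y0) is the top-left sample of the current block and bv = (bvx, bvy).
    """
    bvx, bvy = bv
    rx, ry = x0 + bvx, y0 + bvy
    # Reject vectors that point outside the picture.
    if rx < 0 or ry < 0 or rx + width > len(recon[0]) or ry + height > len(recon):
        raise ValueError("reference block lies outside the picture")
    return [row[rx:rx + width] for row in recon[ry:ry + height]]


# Toy 8x8 picture whose sample value encodes its position (row * 10 + column).
picture = [[10 * r + c for c in range(8)] for r in range(8)]
pred = ibc_predict(picture, x0=4, y0=4, width=2, height=2, bv=(-3, -2))
print(pred)  # [[21, 22], [31, 32]]
```

Here the 2x2 current block at (4, 4) is predicted from the block three samples to the left and two samples up.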

However, the prediction vector referred to in the present invention is not limited to the block vector (BV) for intra prediction described above. The present invention may be applied in the same or a similar manner to techniques based on a motion vector (MV) for inter prediction, and, as described below, it should be understood that it may also be used where the BV and the MV are used together. Furthermore, the present invention does not exclude the possibility that a method based on an embodiment of the present invention is applied in the same or a similar manner where the MV and/or the BV is not explicitly defined in the bitstream syntax but is implicitly derived by the decoder, where a chained vector determined by referencing the MV and/or the BV in a chain is used, or where any other type of prediction vector is used by any other method.

Although IBC is an intra prediction method, the way in which the motion information of the block to be decoded is derived and transmitted may be configured similarly to inter prediction. Accordingly, according to an embodiment of the present invention, IBC compression may be divided into a skip/merge mode (IBC skip/merge mode) and an AMVP mode (IBC AMVP mode). These modes may be understood as modes that encode and transmit IBC-related information in a manner similar to the skip, merge, and AMVP modes used when encoding a motion vector (MV) in inter prediction.

However, the prediction mode referred to in the present invention should be understood to include the classification in the above example as well as every definition of the various modes that may be used to implement the prediction encoding/decoding method described herein. In addition, the information referred to as the prediction mode for convenience may consist of a single piece of information or a plurality of pieces of information, and may be included in the bitstream as a single syntax element or as a plurality of syntax elements. For example, the IBC skip mode, the IBC merge mode, and the IBC AMVP mode may each be defined as a separate prediction mode based on a single bitstream syntax element and/or a single information value. As another example, the prediction mode may be derived by combining a bitstream syntax element and/or information value indicating whether IBC is used with a bitstream syntax element and/or information value indicating which of the skip, merge, and AMVP modes is used. It is also evident that whether intra or inter prediction is used, whether template matching is performed, whether the mode is combined with another intra or inter prediction mode, and the like may be used, jointly or independently, to express the bitstream syntax element and/or information value indicating the prediction mode.

According to an embodiment of the present invention, the bitstream syntax structure provided for the IBC encoding method may be the same as or similar to the examples below. Table 1 below shows part of the sequence parameter set (SPS) in the bitstream syntax structure for the VVC standard or an encoding/decoding specification derived from it. In the SPS syntax structure, when the flag "sps_ibc_enabled_flag" is enabled, the syntax element "sps_six_minus_max_num_ibc_merge_cand", which signals the maximum size of the IBC skip/merge candidate list, may be transmitted.

Table 1

seq_parameter_set_rbsp( ) {
    sps_ibc_enabled_flag
    if(sps_ibc_enabled_flag)
        sps_six_minus_max_num_ibc_merge_cand
    ……
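As a hedged illustration of how the SPS element may translate into a list size, the sketch below follows the VVC convention in which the maximum number of IBC merge candidates equals six minus the signalled value, and is zero when IBC is disabled. The function name is hypothetical, and an encoding/decoding specification derived from VVC may define the derivation differently.

```python
def max_num_ibc_merge_cand(sps_ibc_enabled_flag, sps_six_minus_max_num_ibc_merge_cand=0):
    """Derive the maximum IBC merge candidate count (VVC-style convention)."""
    if not sps_ibc_enabled_flag:
        # The element is not transmitted when IBC is disabled.
        return 0
    return 6 - sps_six_minus_max_num_ibc_merge_cand


print(max_num_ibc_merge_cand(1, 2))  # 4
```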

Table 2 below shows part of the structure of a coding unit (CU) in the bitstream syntax structure for the VVC standard or an encoding/decoding specification derived from it. A person skilled in the art will understand that the CU is one of the block structures corresponding to the coding unit referred to above as the "current block" and the like. In the CU syntax structure, "pred_mode_ibc_flag", which indicates whether the block is encoded in IBC mode, and "general_merge_flag", which indicates whether the encoding mode is merge mode, may be transmitted. When the IBC skip/merge mode is applied, the "merge_idx" syntax element, which identifies one entry of the IBC merge candidate list, may be included; see the "merge_data" syntax structure shown in Table 3 below. This syntax element may be used to define the prediction vector, for example the BV, of the block. When the IBC AMVP mode is applied, syntax elements including "mvd", "mvp", and "amvr" may be included. The "merge_idx" syntax element and/or the "mvd", "mvp", and "amvr" syntax elements may each be applied as values that define the prediction vector, for example the BV, of the block. When the prediction vector is a BV, syntax elements such as "mvd", "mvp", and "amvr" may be identified as, or may include, BV-based syntax elements such as "bvd" (BV difference), a "bvp" (BV predictor) index and/or flag, and an "abvr" (adaptive BV resolution) index, respectively. That is, any description in this specification that refers to "MV" should be understood as interchangeable with "BV", provided that the substitution causes no evident technical conflict and conforms to the technical idea of the present invention.

Both the IBC skip/merge mode and the IBC AMVP mode described above may take the form of uni-prediction, which relies on a single prediction vector.

Table 2

coding_unit(x0, y0, cbWidth, cbHeight, cqtDepth, treeType, modeType) {
    if(((sh_slice_type==I && cu_skip_flag[x0][y0]==0) ||
        (sh_slice_type!=I && (CuPredMode[chType][x0][y0]!=MODE_INTRA ||
        (((cbWidth==4 && cbHeight==4) || modeType==MODE_TYPE_INTRA)
        && cu_skip_flag[x0][y0]==0)))) &&
        cbWidth<=64 && cbHeight<=64 && modeType!=MODE_TYPE_INTER &&
        sps_ibc_enabled_flag && treeType!=DUAL_TREE_CHROMA)
        pred_mode_ibc_flag
    ……
    } else if(treeType!=DUAL_TREE_CHROMA) { /* MODE_INTER or MODE_IBC */
        if(cu_skip_flag[x0][y0]==0)
            general_merge_flag[x0][y0]
        if(general_merge_flag[x0][y0])
            merge_data(x0, y0, cbWidth, cbHeight, chType)
        else if(CuPredMode[chType][x0][y0]==MODE_IBC) {
            mvd_coding(x0, y0, 0, 0)
            if(MaxNumIbcMergeCand>1)
                mvp_l0_flag[x0][y0]
            if(sps_amvr_enabled_flag &&
                (MvdL0[x0][y0][0]!=0 || MvdL0[x0][y0][1]!=0))
                amvr_precision_idx[x0][y0]
        } else {
    ……

Table 3

merge_data(x0, y0, cbWidth, cbHeight, chType) {
    if(CuPredMode[chType][x0][y0]==MODE_IBC) {
        if(MaxNumIbcMergeCand>1)
            merge_idx[x0][y0]
    } else {
    ……
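The parsing order implied by the coding unit and merge data syntax for an IBC block can be sketched as follows. This is an illustrative walk through the conditions only: the `read` callback, the `scripted_reader` helper, and the returned dictionary are hypothetical, entropy decoding is out of scope, and the values used for absent elements (a merge flag of 1 for a skipped CU, otherwise 0) are assumptions borrowed from VVC inference rules.

```python
def parse_ibc_vector_syntax(read, cu_skip_flag, max_num_ibc_merge_cand,
                            sps_amvr_enabled_flag):
    """Walk the IBC branch of the CU syntax for one coding unit.

    read(name) is a hypothetical callback that returns the next decoded
    value of the named syntax element.
    """
    # Assumption: an absent general_merge_flag is inferred as 1 when skipped.
    general_merge_flag = 1 if cu_skip_flag else read("general_merge_flag")

    if general_merge_flag:
        # merge_data(): the index is only sent when there is a choice to make.
        merge_idx = read("merge_idx") if max_num_ibc_merge_cand > 1 else 0
        return {"mode": "skip/merge", "merge_idx": merge_idx}

    mvd = read("mvd_coding")  # (horizontal, vertical) vector difference
    mvp_l0_flag = read("mvp_l0_flag") if max_num_ibc_merge_cand > 1 else 0
    amvr_precision_idx = 0
    if sps_amvr_enabled_flag and mvd != (0, 0):
        amvr_precision_idx = read("amvr_precision_idx")
    return {"mode": "amvp", "mvd": mvd, "mvp_l0_flag": mvp_l0_flag,
            "amvr_precision_idx": amvr_precision_idx}


def scripted_reader(values):
    """Replay fixed (name, value) pairs and check the order of requests."""
    pending = iter(values)

    def read(name):
        expected, value = next(pending)
        if expected != name:
            raise ValueError(f"expected {expected}, parser asked for {name}")
        return value

    return read


merged = parse_ibc_vector_syntax(
    scripted_reader([("general_merge_flag", 1), ("merge_idx", 2)]),
    cu_skip_flag=0, max_num_ibc_merge_cand=6, sps_amvr_enabled_flag=1)
print(merged)  # {'mode': 'skip/merge', 'merge_idx': 2}
```

The scripted reader raises an error if the parser requests elements in an order other than the one listed, which makes the conditional structure easy to check by hand.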

According to an embodiment of the present invention, the precision used to express the value of the prediction vector (for example, the BV) of a block to be decoded may be selected from at least two values, and the selection may be transmitted by the "amvr_precision_idx" syntax element shown in Table 2 above. In addition, according to an embodiment of the present invention, the adaptive motion vector resolution (AMVR) value may correspond to a value indicating the adaptive resolution of the block vector difference (BVD), and may be expressed by the "AmvrShift" value. Table 4 below shows the "AmvrShift" values and their meanings in detail.

Table 4

AmvrShift is listed for three cases:
(A) inter_affine_flag==1
(B) CuPredMode[chType][x0][y0]==MODE_IBC
(C) inter_affine_flag==0 && CuPredMode[chType][x0][y0]!=MODE_IBC

amvr_flag   amvr_precision_idx   AmvrShift (A)          AmvrShift (B)         AmvrShift (C)
0           -                    2 (1/4 luma sample)    -                     2 (1/4 luma sample)
1           0                    0 (1/16 luma sample)   4 (1 luma sample)     3 (1/2 luma sample)
1           1                    4 (1 luma sample)      6 (4 luma samples)    4 (1 luma sample)
1           2                    -                      -                     6 (4 luma samples)

Referring to Table 4 above, when the block to be decoded is encoded in IBC mode, "AmvrShift" may correspond to a precision of 1 pixel or 4 pixels. That is, the prediction vector representing the displacement of the block, for example the BV, may be expressed in units of at least one of 1 pixel and 4 pixels. Depending on the embodiment, this representation may be chosen because computer-generated imagery, unlike natural camera-captured video, does not require the motion information of the prediction vector to be expressed with sub-pixel precision. Various embodiments of the present invention may of course adopt a finer resolution for expressing the prediction vector, and the present invention does not limit such modifications.
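The relationship between an "AmvrShift" value and the precision shown next to it in Table 4 can be checked with a short sketch. It assumes that vectors are stored in units of 1/16 luma sample, which is consistent with the table (a shift of 0 corresponds to 1/16 sample), and that a coded difference is scaled by a left shift; the names below are hypothetical.

```python
from fractions import Fraction

# IBC column of Table 4: (amvr_flag, amvr_precision_idx) -> AmvrShift
AMVR_SHIFT_IBC = {(1, 0): 4, (1, 1): 6}


def step_in_luma_samples(shift):
    """Step size implied by a shift when vectors are stored in 1/16 samples."""
    return Fraction(1 << shift, 16)


def scale_vector_difference(coded, shift):
    """Scale a coded (x, y) difference to 1/16-sample units."""
    return tuple(component << shift for component in coded)


for key, shift in AMVR_SHIFT_IBC.items():
    print(key, shift, step_in_luma_samples(shift))
# (1, 0) 4 1
# (1, 1) 6 4

print(scale_vector_difference((3, -1), 4))  # (48, -16)
```

With a shift of 4, a coded difference of (3, -1) therefore represents a displacement of three samples to the right and one sample up.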

According to an embodiment of the present invention, the present invention may be configured to provide encoding and decoding methods that are useful for screen content. When the block to be decoded is encoded in IBC mode, the present invention may provide a method of calculating and/or transmitting the information of a prediction vector, for example a BV, and may also provide a method of refining that block vector during the decoding process.

Conventional video encoding/decoding techniques calculate, transmit, and reconstruct the BV, which is intra prediction information, less precisely than the motion vector (MV), which is inter prediction information. For example, MV values are expressed with 1/4-pixel precision in ordinary inter prediction and with 1/16-pixel precision in the affine method, whereas BV values are expressed with 1-pixel or 4-pixel precision. Through its various embodiments, the present invention may be used to provide a method of expressing the BV of IBC, a coding method for screen content, more precisely. Depending on the embodiment, the present invention may also be configured to use bi-predictive BV information to increase the accuracy of the BV.

According to an embodiment of the present invention, the IBC encoding method may be subdivided into a plurality of modes including an IBC skip/merge mode, an IBC MBVD (merge with block vector difference) mode, and an IBC BVP (block vector prediction) mode. In the skip/merge mode, the BV value may be transmitted by means of a candidate index for the BV. In the MBVD mode, a "bvd" (BV difference) value may be transmitted in addition to the candidate index. In the BVP mode, the BV value may be transmitted by means of individual values including "bvd" (BV difference), a "bvp" (BV predictor) index, and "abvr" (adaptive BV resolution).
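The three modes differ in which values accompany the candidate index, which the following sketch makes concrete. It is a simplified illustration under stated assumptions: vectors are (x, y) pairs in 1/16-sample units, the MBVD difference is treated as a plain additive offset, and in BVP mode the difference is scaled by a resolution shift without rounding the predictor. The function and mode names are hypothetical.

```python
def reconstruct_bv(mode, candidates, cand_idx, bvd=(0, 0), abvr_shift=0):
    """Rebuild a block vector from a candidate list and the signalled values."""
    base_x, base_y = candidates[cand_idx]
    if mode == "skip_merge":
        # Only the candidate index is signalled.
        return (base_x, base_y)
    if mode == "mbvd":
        # Candidate index plus a vector difference.
        return (base_x + bvd[0], base_y + bvd[1])
    if mode == "bvp":
        # Predictor index, difference, and the resolution of the difference.
        return (base_x + (bvd[0] << abvr_shift), base_y + (bvd[1] << abvr_shift))
    raise ValueError(f"unknown mode: {mode}")


candidates = [(-64, 0), (0, -32)]  # hypothetical candidate list
print(reconstruct_bv("skip_merge", candidates, 1))                       # (0, -32)
print(reconstruct_bv("mbvd", candidates, 0, bvd=(16, 0)))                # (-48, 0)
print(reconstruct_bv("bvp", candidates, 0, bvd=(-1, 2), abvr_shift=4))   # (-80, 32)
```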

First embodiment

According to an embodiment of the present invention, when the IBC BVP mode is applied, information on the precision of the prediction vector (for example, the BV) of the target block may be transmitted in a manner that indicates a sub-pixel unit. In this specification, the term sub-pixel denotes any precision/resolution finer than one integer pixel, such as 1/2 pixel, 1/4 pixel, 1/8 pixel, 1/16 pixel, or 1/32 pixel, without being limited to these examples. The term should also be regarded as interchangeable with any term that expresses substantially the same concept under another name, and may be regarded as equivalent to terms such as sub-pel, fractional pixel, and fractional-pel.

The method according to the above embodiment can express the prediction vector more precisely than the conventional integer-pixel precision (for example, units of 1 pixel or 4 pixels), and can provide, for example, a way of transmitting the value of the prediction vector (that is, the BV) at 1/2-pixel and/or 1/4-pixel precision.

According to an embodiment of the present invention, a bitstream syntax flag "abvr_flag" indicating adaptive block vector resolution (ABVR) may be provided. If the value of "abvr_flag" is determined to be false (which may be expressed as "No", "N", "False", "0", or the like), the BV has a precision of 1 pixel. If the value is determined to be true (which may be expressed as "Yes", "Y", "True", "1", or the like), the BV may be expressed at one of a plurality of resolutions, and the resolution may be indicated by an "abvr_precision_idx" value. Depending on the embodiment, the BV may be configured to have sub-pixel precision (for example, 1/4 pixel) as the default precision even when "abvr_flag" is determined to be false.

Tables 5 to 17 below show various embodiments of how the prediction vector precision, that is, the "AbvrShift" value, may be composed from the combination of the "abvr_flag" value and the "abvr_precision_idx" value. The embodiments illustrated below are examples of the many ways in which these values may be combined; the implementation of the present invention is not limited to them and may include various combinations conceived or derived from them. That is, each of the embodiments below may be generalized as a process of reading the "abvr_precision_idx" value when the value of "abvr_flag" falls within a specific range and determining the precision of the prediction vector accordingly.

Table 5

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    4 (1 luma sample)
1           1                    6 (4 luma samples)
1           2                    2 (1/4 luma sample)

Table 6

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    4 (1 luma sample)
1           1                    6 (4 luma samples)
1           2                    3 (1/2 luma sample)

Table 7

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    4 (1 luma sample)
1           1                    6 (4 luma samples)
1           2                    2 (1/4 luma sample)
1           3                    3 (1/2 luma sample)

Table 8

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    6 (4 luma samples)
1           1                    2 (1/4 luma sample)

Table 9

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    2 (1/4 luma sample)
1           1                    6 (4 luma samples)

Table 10

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    2 (1/4 luma sample)
1           0                    4 (1 luma sample)
1           1                    6 (4 luma samples)

Table 11

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    6 (4 luma samples)
1           1                    3 (1/2 luma sample)

Table 12

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    3 (1/2 luma sample)
1           1                    6 (4 luma samples)

Table 13

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    3 (1/2 luma sample)
1           0                    4 (1 luma sample)
1           1                    6 (4 luma samples)

Table 14

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    6 (4 luma samples)
1           1                    2 (1/4 luma sample)
1           2                    3 (1/2 luma sample)

Table 15

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    4 (1 luma sample)
1           0                    2 (1/4 luma sample)
1           1                    6 (4 luma samples)
1           2                    3 (1/2 luma sample)

Table 16

abvr_flag   abvr_precision_idx   AbvrShift
0           -                    2 (1/4 luma sample)
1           0                    4 (1 luma sample)
1           1                    6 (4 luma samples)
1           2                    3 (1/2 luma sample)

Table 17

abvr_flag   abvr_precision_idx   AbvrShift
0           0                    2 (1/4 luma sample)
0           1                    3 (1/2 luma sample)
1           0                    4 (1 luma sample)
1           1                    6 (4 luma samples)

As described above, the specific values and ordering of "abvr_flag", "abvr_precision_idx", and "AbvrShift" presented in the embodiments disclosed in this specification are given only as examples for convenience of explanation. It will be readily understood that, according to the present invention, the resolution of the BV may be calculated and transmitted in sub-pixel units by various methods without being limited to the forms shown in the above tables.
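The generalized process, in which "abvr_precision_idx" is read only when "abvr_flag" falls within a given range, can be sketched with two of the mappings above as data: one in which the flag value 0 carries no index and selects 1 luma sample, and one in which both flag values carry an index. The dictionary layout, the `read_idx` callback, and the function name are illustrative assumptions, not part of the disclosed syntax.

```python
# Each table lists the flag values for which an index is read, and the
# (abvr_flag, abvr_precision_idx) -> AbvrShift mapping. None means "no index".
TABLE_5 = {
    "idx_when_flag": {1},
    "shift": {(0, None): 4, (1, 0): 4, (1, 1): 6, (1, 2): 2},
}
TABLE_17 = {
    "idx_when_flag": {0, 1},
    "shift": {(0, 0): 2, (0, 1): 3, (1, 0): 4, (1, 1): 6},
}


def derive_abvr_shift(table, abvr_flag, read_idx):
    """Read the precision index only when the flag value calls for it."""
    idx = read_idx() if abvr_flag in table["idx_when_flag"] else None
    return table["shift"][(abvr_flag, idx)]


print(derive_abvr_shift(TABLE_5, 0, lambda: 2))   # 4: index is not read
print(derive_abvr_shift(TABLE_5, 1, lambda: 2))   # 2: 1/4 luma sample
print(derive_abvr_shift(TABLE_17, 0, lambda: 1))  # 3: 1/2 luma sample
```

Keeping the mapping as data rather than as branching code makes it straightforward to swap in any of the other combinations.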

Second embodiment

According to an embodiment of the present invention, a method of refining the information of a prediction vector, for example a BV, during the decoding process may be applied. If the precision of the prediction vector is calculated and transmitted in sub-pixel units as in the embodiment described above, the number of bits needed to express that information may increase. To overcome this drawback, according to an embodiment of the present invention, the vector value may be handled in integer-pixel units when the prediction vector (BV) is calculated and transmitted, and the encoder/decoder may refine it to sub-pixel units during the decoding process.
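One way to picture this refinement is as a search over sub-pixel positions around the transmitted integer vector. The sketch below only enumerates such candidate positions; the square search pattern, the radius, the 1/16-sample storage unit, and the function name are all assumptions for illustration, since the text above does not fix a particular search pattern.

```python
def subpel_candidates(int_bv, shift=2, radius=1):
    """List refinement candidates around an integer-pixel block vector.

    int_bv is in whole luma samples; results are in 1/16-sample units.
    shift=2 gives quarter-sample steps (2**2 / 16 of a sample).
    """
    center_x, center_y = int_bv[0] * 16, int_bv[1] * 16
    step = 1 << shift
    return [(center_x + dx * step, center_y + dy * step)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


cands = subpel_candidates((-3, 0))
print(len(cands))  # 9
print(cands[4])    # (-48, 0): the unrefined integer vector itself
```

Because the candidates are derived identically at the encoder and the decoder, no additional bits are needed to describe them; only the selection rule has to be shared.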

The embodiments of the present invention are not limited to calculating and/or transmitting the vector in integer-pixel units and refining it in sub-pixel units as described above. They should be understood to include any method in which the prediction vector, calculated and/or transmitted with some other precision unit or method, is compensated during the decoding process to a precision finer than the precision it had when it was calculated and/or transmitted. For example, a method of transmitting the BV value in 1/2-pixel units and compensating it in 1/4-pixel units should also be regarded as included in an embodiment of the present invention.

Table 18 below shows a method of transmitting a prediction vector with a single precision according to an embodiment of the present invention. As shown in Table 18, the prediction vector (for example, the BV) may be transmitted based on a precision of either 1 pixel or 4 pixels.

Table 18

abvr_precision_flag   AbvrShift
0                     4 (1 luma sample)
1                     6 (4 luma samples)

According to an embodiment of the present invention, a template matching (TM) method may be used when refining the value of a prediction vector, for example a BV. TM may refer to a process of setting a certain area adjacent to the left and/or top of the block to be encoded/decoded (that is, the current block) as the current template, setting a certain area adjacent to the left and/or top of a reference block as the corresponding reference template, measuring the difference (error) between the template of the current block and the template of the reference block, and selecting the reference block (or prediction block) with the smallest difference. Here, the search range for the reference block may be limited to a reconstructed area of a predefined size within the current picture, and the difference between the pixels belonging to the templates may be measured using a measure such as the sum of absolute differences (SAD) or the mean squared error (MSE).
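The template comparison can be sketched as follows, using SAD as the difference measure. The helper names, the one-sample template thickness, the list-of-rows picture layout, and the integer candidate vectors are assumptions for illustration; the caller is assumed to pass only candidates whose templates lie inside the reconstructed area, since the sketch performs no bounds checks.

```python
def template_samples(pic, x, y, w, h, t=1):
    """Collect the samples in the strips above and to the left of a block."""
    top = [pic[yy][xx] for yy in range(y - t, y) for xx in range(x, x + w)]
    left = [pic[yy][xx] for yy in range(y, y + h) for xx in range(x - t, x)]
    return top + left


def sad(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(p - q) for p, q in zip(a, b))


def best_bv_by_template(pic, x, y, w, h, candidate_bvs, t=1):
    """Pick the candidate whose reference template best matches the current one."""
    current = template_samples(pic, x, y, w, h, t)
    return min(candidate_bvs,
               key=lambda bv: sad(current,
                                  template_samples(pic, x + bv[0], y + bv[1], w, h, t)))


# Toy picture that repeats every 4 columns, so a shift of -4 matches exactly.
pic = [[(c % 4) * 10 + r for c in range(8)] for r in range(8)]
best = best_bv_by_template(pic, x=5, y=2, w=2, h=2,
                           candidate_bvs=[(-3, 0), (-4, -1), (-4, 0)])
print(best)  # (-4, 0)
```

In this toy case the template costs are 40, 4, and 0 respectively, so the last candidate is selected.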

FIG. 8 is a conceptual diagram of prediction vector refinement based on template matching according to an embodiment of the present invention. Referring to FIG. 8, for example, a predetermined area around the current block (810) may be defined as the current template (815). Next, a plurality of candidate reference blocks (820) may be searched within the reconstructed area (890) of the same picture, and the block whose surrounding reference template (825) most closely matches the current template (815) may be defined as the final reference block (820). The position of the reference block (820) found in this way may be used to determine the prediction vector relative to the current block (810), that is, the BV (830). Depending on the embodiment, the vector value refined as the prediction vector (830) of the current block may be stored in memory and used for blocks that are encoded/decoded afterwards. The stored prediction vector may be used in the process of determining the prediction vectors of subsequent blocks, for example in a manner similar to history-based motion vector prediction (HMVP).
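Storing refined vectors for later blocks can be pictured as a small first-in, first-out list with duplicate pruning, in the spirit of HMVP. The class name, the default capacity of five entries, and the pruning rule are assumptions made for this sketch; the text above only states that the stored vectors may be used in an HMVP-like manner.

```python
class BvHistory:
    """Bounded history of recently used block vectors (HMVP-like sketch)."""

    def __init__(self, max_size=5):
        self.max_size = max_size
        self.entries = []  # oldest first, most recent last

    def push(self, bv):
        if bv in self.entries:
            # Move an existing vector to the most recent position.
            self.entries.remove(bv)
        elif len(self.entries) == self.max_size:
            # Drop the oldest entry to make room.
            self.entries.pop(0)
        self.entries.append(bv)


history = BvHistory(max_size=3)
for refined_bv in [(1, 0), (2, 0), (3, 0), (1, 0)]:
    history.push(refined_bv)
print(history.entries)  # [(2, 0), (3, 0), (1, 0)]
```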

FIG. 9 is a flowchart of the prediction vector refinement process according to an embodiment of the present invention. Referring to FIG. 9, the value of "abvr_precision_flag", bitstream syntax information indicating the resolution of the prediction vector (e.g., the BV) for the block to be encoded/decoded, may first be parsed (S910). Next, the value of "BV_refinement_enabled_flag", which indicates whether prediction vector refinement is allowed, may be checked (S920). When three or more resolution/precision values are defined, "abvr_precision_flag" may be transmitted in the form of an index. The present invention does not limit the number of resolution/precision values that may be specified.

Next, if the value of "BV_refinement_enabled_flag" is determined to be true (which may be expressed as "Yes", "Y", "True", "1", or the like) (S920), the value of "BV_refinement_flag", bitstream syntax information indicating whether to perform prediction vector refinement for the target block, may be parsed (S930). Here, "BV_refinement_enabled_flag" is high-level syntax (HLS) and may be obtained from at least one of the header areas including the SPS (sequence parameter set), PPS (picture parameter set), PH (picture header), and SH (slice header).

If the value of "BV_refinement_flag" is determined to be true (which may be expressed as "Yes", "Y", "True", "1", or the like) (S940), the TM process may be executed (S950) at a precision finer than the prediction vector precision determined by the "abvr_precision_flag" transmitted as the "BV precision" value of the target block, and the value of the prediction vector, i.e., the BV, may be refined according to the result (S960). For example, when the transmitted "BV precision" is in 1-pixel units, the value of the prediction vector may be refined to 1/4-pixel precision through the TM process.
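The decision flow of FIG. 9 (S910 to S960) can be sketched as below. The function name and the use of a plain sequence of already entropy-decoded syntax values are assumptions made for illustration; the actual bitstream parsing is not specified here.

```python
def parse_bv_refinement(syntax_values, bv_refinement_enabled_flag):
    """Follow S910-S940 and report whether the TM refinement (S950/S960) should run.

    syntax_values: hypothetical sequence of decoded block-level syntax values,
    in order: abvr_precision_flag, then BV_refinement_flag if it is present.
    bv_refinement_enabled_flag: high-level flag already read from SPS/PPS/PH/SH.
    """
    values = iter(syntax_values)
    abvr_precision_flag = next(values)         # S910: BV resolution
    bv_refinement_flag = 0
    if bv_refinement_enabled_flag:             # S920: refinement allowed?
        bv_refinement_flag = next(values)      # S930: per-block flag
    run_tm = bool(bv_refinement_flag)          # S940: true -> S950, S960
    return abvr_precision_flag, run_tm
```

When the high-level flag is off, the per-block flag is never read, so no bits are spent on it.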

According to another embodiment of the present invention, the step of checking "BV_refinement_enabled_flag" (S920) may be omitted, and the value of "BV_refinement_flag", which indicates whether to perform prediction vector refinement for the target block, may be parsed directly (S930).

According to yet another embodiment of the present invention, the whole series of steps from checking "BV_refinement_enabled_flag" (S920) through checking "BV_refinement_flag" (S940) may be omitted, so that the process proceeds directly to the TM process (S950) and refines the prediction vector value (S960) without asking whether refinement should be performed for the target block.

According to one embodiment of the present invention, the value of the transmitted prediction vector (BV) is set as the position of an initial reference block, the TM process is performed around that initial reference block to obtain a final updated vector value, and the vector value obtained in this way may be used as the final prediction vector (BV) value of the target block. The TM process may be configured to apply compensation to the prediction vector (BV) down to sub-pixel units, so that the resulting prediction vector value can have sub-pixel precision. This vector refinement process can increase the precision of the prediction vector (BV). In addition, because the initial position is fixed by the transmitted vector value, the search range used to settle on the reference block for the target block is limited, which can reduce decoder complexity.
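As a rough illustration of how the transmitted vector bounds the search, the sketch below enumerates quarter-pel candidates around a BV sent at 1-pixel precision. The function name, the quarter-pel storage unit, and the window of ±2 quarter-pel steps are assumptions for illustration; the text does not fix any of them.

```python
def refinement_candidates(bv_int, search_range_qpel=2):
    """List candidate BVs, in quarter-pel units, around a BV sent at 1-pel precision."""
    base_x, base_y = bv_int[0] * 4, bv_int[1] * 4   # 1 pel = 4 quarter-pel units
    r = search_range_qpel
    return [(base_x + dx, base_y + dy)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)]
```

With the default window only 25 positions are evaluated, regardless of picture size. Evaluating the fractional positions would additionally require an interpolation filter, which is not described here.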

It will be readily understood that the improved method according to the embodiment described above is equally applicable to the IBC skip/merge mode, the IBC MBVD mode, the IBC BVP mode, and the like.

Third embodiment

According to one embodiment of the present invention, a method of refining the information of a prediction vector, for example a BV, during the decoding process may be applied. According to the embodiment, when the prediction vector (BV) value of a target block is transmitted, the decoder may divide the target block into sub-blocks of a predefined size and then refine and/or compensate the vector value for each sub-block. For example, a 32x16 target block may be divided into eight 8x8 sub-blocks, and the vector value may then be refined for each 8x8 sub-block. The specific method of sub-block division is not limited by the present invention; for example, the sub-block size is not limited to 8x8. In the present invention, any predetermined pixel unit that divides the target block into parts smaller than the block itself may be regarded as a sub-block. As in the embodiments described earlier, template matching (TM) may be used when refining the BV value for each sub-block.

FIG. 10 is a conceptual diagram of sub-block-based prediction vector refinement according to an embodiment of the present invention. FIG. 10 shows, by way of example, a target block (1010) of 32x16 pixels divided into eight 8x8 sub-blocks. A method of compensating the prediction vector (BV) value per sub-block is described below as an example. Based on the initial BV value (1030) of the target block (1010), the BV value may be refined for each 8x8 sub-block, taking the virtual reference block (1020) indicated by the initial BV value as the starting point.
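The 32x16 example can be reproduced with a short partitioning sketch. The function name and the raster-scan ordering are assumptions for illustration, and the block dimensions are assumed to be multiples of the sub-block size.

```python
def split_into_subblocks(x, y, width, height, sub_w=8, sub_h=8):
    """Return (x, y, w, h) tuples for the sub-blocks of a block, in raster-scan order."""
    return [(x + i, y + j, sub_w, sub_h)
            for j in range(0, height, sub_h)
            for i in range(0, width, sub_w)]
```

For a 32x16 block this yields eight 8x8 sub-blocks, four per row. Each sub-block would start from the block's initial BV before being refined individually.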

As shown in FIG. 10, the template most similar to the first sub-template (1016) of the first 8x8 sub-block (1011) may be searched for through the TM process. Once the corresponding first reference template (1026) is found, the first sub-reference block (1021) may be located according to the compensated BV value (1031) of the first 8x8 sub-block (1011). Similarly, the template most similar to the second sub-template (1017) of the second 8x8 sub-block (1012) may be searched for through the TM process. Once the corresponding second reference template (1027) is found, the second sub-reference block (1022) may be located according to the compensated BV value (1032) of the second 8x8 sub-block (1012).

Here, the left part of the region included in the second sub-template (1017) may use the right-hand region of the values derived from the predicted value of the first 8x8 sub-block (1011) (which may be based on the first sub-reference block (1021) and/or the first reference template (1026)). That is, according to an embodiment of the present invention, when the BV values of the divided sub-blocks are refined sequentially, the process may be configured not to use reconstructed values of a preceding sub-block belonging to the same block (for example, the first sub-block (1011), which precedes the second sub-block (1012) in the target block (1010)). The same process may be repeated for the eight 8x8 sub-blocks belonging to the target block (1010) to compensate the BV value of each sub-block. According to an embodiment, the initial BV value (1030) of the target block (1010) may be transmitted independently through bitstream syntax or may be a value derived from other information.

According to the embodiment described above, refining the prediction vector (BV) per sub-block rather than per target block increases the precision of the prediction vector (BV), and performing the TM-based refinement at a precision finer than that of the target block's prediction vector can improve it further. According to one embodiment, the prediction vector (BV) value of the target block may be transmitted at 1-pixel precision, and the BV value of each sub-block may then be compensated in 1/4-pixel units. According to another embodiment, the prediction vectors (BV) of the target block and of the sub-blocks may be compensated at the same precision. For example, the vector value of the target block may be transmitted at 1-pixel precision and the vector value of each sub-block compensated again at 1-pixel precision.

Embodiments of the present invention are not limited to calculating and/or transmitting the vector value of the target block in integer pixels and refining it in sub-pixel units for each sub-block as described above. They should be understood to include any method that compensates the vector of each sub-block at a precision equal to or finer than the prediction vector precision with which the target block's vector was calculated/transmitted, whatever precision unit or method is used. For example, a method that transmits the BV value of the target block in 1/2-pixel units and compensates the BV value of each sub-block in 1/4-pixel units should also be regarded as included in an embodiment of the present invention.

In implementing the above embodiment, the flowchart shown in FIG. 9 may be applied in the same or a similar manner, and the processing steps in the flowchart follow the description given earlier. However, each step concerning whether and how the vector is refined is to be read as concerning whether and how refinement is performed for a sub-block. As in the embodiment described above, according to another embodiment of the present invention, the step of checking "BV_refinement_enabled_flag" (S920) may be omitted, and the value of "BV_refinement_flag", which indicates whether to perform prediction vector refinement for the sub-blocks, may be parsed directly (S930). According to yet another embodiment of the present invention, the whole series of steps from checking "BV_refinement_enabled_flag" (S920) through checking "BV_refinement_flag" (S940) may be omitted, so that the process proceeds directly to the TM process (S950) and refines the prediction vector value of each sub-block (S960) without asking whether refinement should be performed for the sub-blocks.

Fourth embodiment

According to one embodiment of the present invention, a prediction method is proposed that uses a prediction vector, for example a BV, defined by uni-prediction and/or bi-prediction. Here, bi-prediction may refer to a method of deriving two different reference blocks (or prediction blocks) within the current picture and combining them by prediction fusion, using a method such as a weighted sum or weighted average, to finally generate a single prediction block.

FIG. 11 is a conceptual diagram of a prediction method using bi-prediction according to an embodiment of the present invention. Referring to FIG. 11, two reference blocks (1121, 1122) are derived for the target block (1110) by using two prediction vectors (1131, 1132), and a final reference block is then obtained by applying a weight-based operation (for example, a weighted sum or weighted average) to the prediction block values derived from the two reference blocks (1121, 1122). Depending on the embodiment, the weights may be defined as equal, i.e., 5:5, or may be transmitted through separate bitstream syntax information. For example, a weight split such as 4:6 or 3:7 may be transmitted by means of a flag value or an index value included in the syntax information. Instead of being transmitted independently in the bitstream, the weights may also be derived from other information obtained from the bitstream during decoding. The derivation may yield the weights directly, or may yield a value corresponding to the flag or index value described above.
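The weight-based fusion of two reference blocks can be sketched as a rounded weighted average. The function name and the rounding rule (add half the weight total before integer division) are assumptions for illustration; the text names only the weight ratios.

```python
import numpy as np

def fuse_predictions(block0, block1, w0=5, w1=5):
    """Weighted average of two reference blocks with round-half-up integer division."""
    total = w0 + w1
    acc = w0 * block0.astype(np.int64) + w1 * block1.astype(np.int64)
    return ((acc + total // 2) // total).astype(block0.dtype)
```

With samples 100 and 200, the 5:5 default gives 150, while a 3:7 split gives 170, pulling the result toward the second block.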

The two reference blocks (1121, 1122) described above may be derived by obtaining the prediction vector for each reference block, for example the BV values (1131, 1132), using one of various modes including the skip/merge mode, the MBVD mode, and the BVP mode. As one example, the BV (1131) for the first reference block (1121) may be selected and transmitted in skip/merge mode, and the BV (1132) for the second reference block (1122) in MBVD mode. As another example, the BV (1131) for the first reference block (1121) may be selected and transmitted in merge mode, and the BV (1132) for the second reference block (1122) in BVP mode. As yet another example, the weights for the two reference blocks (1121, 1122) may be derived from the respective modes. For instance, a higher weight (e.g., 7) may be assigned to the reference block that uses BVP mode and a lower weight (e.g., 3) to the reference block that uses merge mode. From the embodiments described above, it can be seen that many other combinations of modes and weights are possible beyond these examples.

According to one embodiment of the present invention, the prediction vectors, for example the BV values, of the two reference blocks (1121, 1122) may be selected and transmitted in the same mode. For example, the BV (1131) for the first reference block (1121) may be selected and transmitted in merge mode, and the BV (1132) for the second reference block (1122) may also be selected and transmitted in merge mode. Depending on the embodiment, the weights needed to merge the two reference blocks (1121, 1122) may be derived from the transmission order. For example, a higher weight (e.g., 7) may be assigned to the vector value transmitted first and a lower weight (e.g., 3) to the vector value transmitted later. Conversely, a lower weight (e.g., 3) may be assigned to the vector value transmitted first and a higher weight (e.g., 7) to the vector value transmitted later. From the embodiments described above, it can be seen that many other applications of order-dependent weights are possible beyond these examples; for instance, any predetermined weights may be assigned according to the transmission order.

The following describes additional embodiments for determining the weights of the two reference blocks when bi-prediction is performed using two reference blocks in IBC mode.

According to one embodiment of the present invention, bitstream syntax information indicating whether two reference blocks are used for the target block, for example "two_ref_block_flag", may be parsed. If the value of "two_ref_block_flag" is determined to be true (which may be expressed as "Yes", "Y", "True", "1", or the like), the bitstream syntax for two prediction vector (BV) values is parsed; if the value of "two_ref_block_flag" is determined to be false (which may be expressed as "No", "N", "False", "0", or the like), the bitstream syntax for one prediction vector (BV) value is parsed. The mode for each of the two reference blocks may be transmitted using one of various schemes including the skip/merge mode, the MBVD mode, and the BVP mode described above.

According to one embodiment of the present invention, a further item of bitstream syntax information, for example "two_ref_block_enabled_flag", may indicate whether the syntax information signalling the use of two reference blocks (for example, "two_ref_block_flag") is itself used. If the value of "two_ref_block_enabled_flag" is determined to be true (which may be expressed as "Yes", "Y", "True", "1", or the like), "two_ref_block_flag" is parsed. Here, "two_ref_block_enabled_flag" is high-level syntax (HLS) and may be obtained from at least one of the header areas including the SPS (sequence parameter set), PPS (picture parameter set), PH (picture header), and SH (slice header).

According to one embodiment of the present invention, direction flag information for the target block may be transmitted. FIG. 12 is a flowchart of the processing of direction flag information according to an embodiment of the present invention. Referring to FIG. 12, the direction flag information (for example, "direction_flag") may be parsed (S1210). It may then be determined (S1220) whether the direction flag information indicates uni-prediction or bi-prediction. If the direction flag information indicates uni-prediction, one piece of BV information is transmitted (S1230); if it indicates bi-prediction, two pieces of BV information are transmitted (S1235).

According to one embodiment of the present invention, in the skip/merge mode and the MBVD mode the direction flag information of the target block is determined by the finally selected candidate index value, whereas in the BVP mode it may be determined by bitstream syntax information such as "direction_flag". In the skip/merge mode and the MBVD mode, the prediction vector (BV) information indicated by the finally selected candidate index value may represent either uni-prediction or bi-prediction. In the BVP mode, if the value of the "direction_flag" syntax information indicates bi-prediction, information on two prediction vector (BV) values may be parsed. The prediction vector (BV) information may include at least one of the syntax elements "bvd", "bvp_idx" (or "bvp_flag"), "abvr_idx", and "ref_idx".
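The mode-dependent source of the direction information can be sketched as follows. The function name, the mode strings, and the convention that a true `direction_flag` means bi-prediction are all assumptions for illustration; the text does not define the flag's value mapping.

```python
def num_prediction_vectors(mode, candidate_is_bi=None, direction_flag=None):
    """Return how many BVs the target block uses: 1 (uni-prediction) or 2 (bi-prediction)."""
    if mode in ("skip_merge", "mbvd"):
        # Direction is inherited from the finally selected candidate.
        is_bi = bool(candidate_is_bi)
    elif mode == "bvp":
        # Direction is signalled explicitly (assumed: 1 means bi-prediction).
        is_bi = bool(direction_flag)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return 2 if is_bi else 1
```

In the BVP case the returned count is the number of BV syntax groups (for example "bvd" and "bvp_idx") that a decoder would go on to parse.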

In this case as well, when bi-prediction is applied, two reference blocks (1121, 1122) are derived and a final reference block is then obtained by applying a weight-based operation (for example, a weighted sum or weighted average) to the prediction block values derived from the two reference blocks (1121, 1122). Depending on the embodiment, the weights may be defined as equal, i.e., 5:5, or may be transmitted through separate bitstream syntax information. For example, a weight split such as 4:6 or 3:7 may be transmitted by means of a flag value or an index value included in the syntax information. Instead of being transmitted independently in the bitstream, the weights may also be derived from other information obtained from the bitstream during decoding. The derivation may yield the weights directly, or may yield a value corresponding to the flag or index value described above.

The prediction fusion may be carried out in various ways that extend the two-reference-block method described above. For example, three or more reference blocks may be used with suitable weights, and at least one of the reference blocks included in the prediction fusion may be derived by a method that is not based on a prediction vector, for example DC prediction or planar prediction.

Fifth embodiment

According to one embodiment of the present invention, a method using an extended bi-prediction vector may be used. With the extended bi-prediction vector, intra prediction and inter prediction can be performed at the same time. For example, a reference block indicated by a BV, the intra prediction vector, may be derived within the current picture, and a reference block indicated by an MV, the inter prediction vector, may be derived within a reference picture. The method may thus refer to deriving two different reference blocks in this way and then combining them by prediction fusion, using a method such as a weighted sum or weighted average, to finally generate a single prediction block.

FIG. 13 is a conceptual diagram illustrating a prediction method using extended bidirectional prediction according to one embodiment of the present invention. Referring to FIG. 13, for a target block (1310), a first reference block (1320) is derived by intra prediction from the decoded area (1390) within the current picture using a block vector (BV) (1330), and a second reference block (1340) is derived by inter prediction within a reference picture using a motion vector (MV) (1350). Finally, a final reference block can be obtained by applying a weight-based operation (for example, a weighted sum or a weighted average) to the prediction block values derived from the two reference blocks (1320, 1340). According to various embodiments, the weights may be defined as 5:5 so that both blocks contribute equally, or may be transmitted through separate bitstream syntax information. For example, a weight distribution such as 4:6 or 3:7 can be transmitted by means of a flag value or an index value included in the syntax information. Alternatively, instead of being transmitted independently in the bitstream, the weights may be derived from other information obtained from the bitstream during decoding. The derivation may yield the weights directly, or may yield a value corresponding to the flag value or index value described above.
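The weight-based fusion described for FIG. 13 can be sketched as follows. This is a minimal illustration, not the method fixed by the specification: the index-to-weight table, the rounding rule, and the names `WEIGHT_TABLE` and `fuse_predictions` are assumptions made for the example, and only the ratios 5:5, 4:6, and 3:7 come from the text.

```python
import numpy as np

# Hypothetical index-to-weight table (w_intra, w_inter). The text names 5:5,
# 4:6 and 3:7 only as examples and does not fix any index mapping.
WEIGHT_TABLE = {0: (5, 5), 1: (4, 6), 2: (3, 7)}


def fuse_predictions(ref_intra, ref_inter, weight_idx=0, bit_depth=8):
    """Weighted average of two reference blocks with integer rounding."""
    w_intra, w_inter = WEIGHT_TABLE[weight_idx]
    total = w_intra + w_inter
    a = ref_intra.astype(np.int32)
    b = ref_inter.astype(np.int32)
    # Add half of the divisor before the integer division to round to nearest.
    fused = (w_intra * a + w_inter * b + total // 2) // total
    return np.clip(fused, 0, (1 << bit_depth) - 1).astype(np.uint16)


intra_block = np.full((4, 4), 100, dtype=np.uint16)  # block found via the BV
inter_block = np.full((4, 4), 200, dtype=np.uint16)  # block found via the MV
equal = fuse_predictions(intra_block, inter_block, weight_idx=0)
skewed = fuse_predictions(intra_block, inter_block, weight_idx=2)
```

With equal weights every fused sample is the midpoint of the two inputs; with the 3:7 entry the result moves toward the inter reference block.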

As described above, the first reference block (1320), obtained by intra prediction, is derived based on the BV (1330), and the second reference block (1340), obtained by inter prediction, is derived based on the MV (1350). The BV information can be obtained by one of various intra prediction modes, including the IBC skip/merge mode, the IBC MBVD mode, and the IBC BVP mode. Likewise, the MV information can be obtained by one of various inter prediction modes, including the skip/merge mode, the MMVD mode, and the AMVP mode. According to various embodiments, the weights may be defined as 5:5 so that they are equal, or may be defined differently depending on the prediction mode. For example, a higher weight (e.g., 7) may be assigned to the reference block based on inter prediction and a lower weight (e.g., 3) to the reference block based on intra prediction. Conversely, a lower weight (e.g., 3) may be assigned to the reference block based on inter prediction and a higher weight (e.g., 7) to the reference block based on intra prediction. In light of these embodiments, weights determined depending on the prediction mode are not limited to the examples above, and various other applications are possible. That is, any predetermined weight may be assigned depending on the prediction mode.
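A mode-dependent weight assignment of this kind could be expressed as a simple lookup. The sketch below is illustrative only: the mode labels, the particular pairs listed in `MODE_WEIGHTS`, and the function names are hypothetical, since the text states only that some predetermined weight may be tied to the prediction mode.

```python
# Hypothetical mode names and weight assignments. The text only states that
# a predetermined weight may depend on the prediction mode.
DEFAULT_WEIGHTS = (5, 5)  # (intra weight, inter weight)
MODE_WEIGHTS = {
    ("IBC_MERGE", "AMVP"): (3, 7),  # favor the inter reference block
    ("IBC_BVP", "MERGE"): (7, 3),   # favor the intra reference block
}


def weights_for_modes(bv_mode, mv_mode):
    """Return (w_intra, w_inter) for the mode pair, 5:5 if the pair is unlisted."""
    return MODE_WEIGHTS.get((bv_mode, mv_mode), DEFAULT_WEIGHTS)


def blend_sample(intra_sample, inter_sample, bv_mode, mv_mode):
    """Weighted average of two co-located samples, rounded to nearest."""
    w_intra, w_inter = weights_for_modes(bv_mode, mv_mode)
    total = w_intra + w_inter
    return (w_intra * intra_sample + w_inter * inter_sample + total // 2) // total
```

Because the table is known to both encoder and decoder in this sketch, no weight information needs to be transmitted once the two modes are known.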

According to one embodiment of the present invention, the method of performing intra prediction (BV) and inter prediction (MV) together may be treated as a separate prediction mode, and a dedicated flag or index value indicating that prediction mode may be defined and used as bitstream syntax information (for example, "combined_bv_mv_pred_flag").

For example, if the "combined_bv_mv_pred_flag" value is determined to be true (which may be represented as "Yes", "Y", "True", "1", and so on), the syntax information for the BV value and the syntax information for the MV value may each be parsed according to its own mode. The mode used to express the BV value may be one mode implicitly predetermined among various intra prediction modes, including the IBC skip/merge mode, the IBC MBVD mode, and the IBC BVP mode, or may be designated explicitly through separate syntax information. The mode used to express the MV value may likewise be one mode implicitly predetermined among various inter prediction modes, including the skip/merge mode, the MMVD mode, and the AMVP mode, or may be designated explicitly through separate syntax information. In either case, once the mode for the BV and the mode for the MV have been determined, the syntax information of each prediction vector can be parsed according to the corresponding mode.

According to one embodiment of the present invention, bitstream syntax information, for example "combined_bv_mv_pred_enabled_flag", may be provided to indicate whether the bitstream syntax information that signals combined intra prediction (BV-based) and inter prediction (MV-based), for example the "combined_bv_mv_pred_flag", is used. If the "combined_bv_mv_pred_enabled_flag" value is determined to be true (which may be represented as "Yes", "Y", "True", "1", and so on), the "combined_bv_mv_pred_flag" is parsed. The "combined_bv_mv_pred_enabled_flag" is high-level syntax (HLS) and can be obtained from at least one of the header areas including the sequence parameter set (SPS), the picture parameter set (PPS), the picture header (PH), and the slice header (SH).
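The gating relationship between the two flags can be sketched as below. This is a simplified illustration under stated assumptions: flags are read as raw bits rather than through an entropy decoder, the one-bit mode selectors and the mode labels are hypothetical, and `BitReader` and `parse_combined_pred` are names invented for the example. Only the two flag names come from the text.

```python
class BitReader:
    """Reads single bits, most significant bit first, from a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read_flag(self):
        byte = self.data[self.pos // 8]
        bit = (byte >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return bool(bit)


def parse_combined_pred(reader, enabled_flag):
    """Parse block-level syntax for the combined BV+MV mode.

    enabled_flag stands for combined_bv_mv_pred_enabled_flag, assumed to have
    been read earlier from a high-level header (SPS, PPS, PH or SH).
    """
    combined = False
    if enabled_flag:
        # combined_bv_mv_pred_flag is only present when the tool is enabled.
        combined = reader.read_flag()
    result = {"combined_bv_mv_pred_flag": combined}
    if combined:
        # Hypothetical one-bit mode selectors; the actual BV and MV syntax
        # for the selected modes would be parsed after this point.
        result["bv_mode"] = "IBC_BVP" if reader.read_flag() else "IBC_MERGE"
        result["mv_mode"] = "AMVP" if reader.read_flag() else "MERGE"
    return result
```

When the high-level flag is false, the block-level flag is not read at all and the reader position is left unchanged, which is what allows the tool to be disabled without spending bits per block.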

Encoder and decoder

It is evident that the encoding method according to the present invention applies equally to the encoder and the decoder. First, the method can be used in the encoder as one way of generating the sample values of a prediction block for intra and/or inter prediction. When residual signal information is encoded as the difference from this prediction block, the decoder generates the sample values of the same prediction block using a corresponding and/or symmetrical method and obtains the decoded samples by combining the difference values with the prediction block. Additionally, as illustrated by the internal decoder (420) and the coding loop containing it in FIG. 4, this decoding process can be implemented identically inside the encoder in order to predict the state of the decoder.
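The symmetry between encoder and decoder can be illustrated with a small sketch. It assumes a lossless path, with no transform or quantization of the residual, which a practical codec would normally include; the function names are hypothetical.

```python
import numpy as np


def encode_residual(original, prediction):
    """Encoder side: the residual is the sample-wise difference."""
    return original.astype(np.int32) - prediction.astype(np.int32)


def reconstruct(prediction, residual, bit_depth=8):
    """Decoder side: add the residual back and clip to the sample range."""
    recon = prediction.astype(np.int32) + residual
    return np.clip(recon, 0, (1 << bit_depth) - 1).astype(np.uint16)


original = np.array([[10, 250], [0, 128]], dtype=np.uint8)
prediction = np.array([[12, 240], [5, 128]], dtype=np.uint8)
residual = encode_residual(original, prediction)
decoded = reconstruct(prediction, residual)
```

Because both sides build the same prediction block, adding the residual in the decoder recovers the original samples exactly in this lossless setting. With quantization the encoder would run the same `reconstruct` step internally so that its reference samples match the decoder's.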

The encoding method according to the present invention described above can be implemented by an encoder as a device. Such an encoder may keep the conventional encoder structure illustrated in FIGS. 1 to 6 or apply certain changes to it. The implementation is not limited to what is illustrated, however: any encoder structure capable of functioning as a video encoder should be regarded as an encoder according to the present invention as long as it implements the technical idea of the present invention.

In addition, the method for decoding the encoding result according to the present invention described above can be implemented by a decoder as a device. Such a decoder may keep the conventional decoder structure illustrated in FIGS. 1 to 6 or apply certain changes to it. The implementation is not limited to what is illustrated, however: any decoder structure capable of functioning as a video decoder should be regarded as a decoder according to the present invention as long as it implements the technical idea of the present invention.

A person skilled in the art will readily understand that a bitstream encoded by the method and device described above can be decoded by applying a method that is symmetrical to and/or the reverse of the encoding method. In one embodiment, when information for decoding is read from the encoded bitstream, at least one variable length coded syntax element included in the encoded bitstream may be interpreted, and in one embodiment the variable length coding may be performed by an entropy coding method. The technical details and applications of such a decoding procedure can be readily understood from the encoding procedure described above.

The encoder and/or decoder described herein may each be a device including a processor and a memory, preferably implemented as a computing device. The processor that may be included in the encoder and/or decoder described herein may mean one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions.

Even where the processor is referred to in the singular for ease of understanding, those skilled in the art will appreciate that the processor may include multiple processing elements and/or multiple types of processing elements. For example, a device according to one embodiment of the present invention may include, as the processor, multiple processors, or one processor and one controller. Furthermore, the processor may be implemented in various processing configurations, such as a parallel processor or a multi-core processor.

The processor may be configured to run an operating system (OS) and one or more pieces of software executed on the operating system. The processor may also access, store, manipulate, process, and generate data in response to the execution of the software.

The software may include a computer program, code, instructions, or a combination of one or more of these. It may control the processor to operate as desired and may be configured to instruct the processor independently or collectively. The software may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processor or to provide instructions or data to the processor. The software may also be distributed over networked computer systems and stored or executed in a distributed manner.

The software may also be implemented in the form of program instructions that can be executed by various computer means and recorded or stored in the memory. The memory may be a computer-readable recording medium, on which program instructions, data files, data structures, and the like may be recorded alone or in combination. The program instructions stored in the memory may be based on an instruction scheme specially designed and configured for embodiments of the present invention, or may follow an instruction scheme known and available to those skilled in computer software, for example Assembly, C, C++, Java, or Python. The instruction scheme and the resulting program instructions should be understood to include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by the device according to an embodiment of the present invention and/or by the processor using an interpreter or the like.

The computer-readable recording medium constituting a device according to one embodiment of the present invention, including the memory described herein, may include a temporary or volatile recording medium that is retained only while the processor is operating, such as a processor cache, RAM, or flash memory. It may also include a relatively non-volatile recording medium capable of long-term storage, such as magnetic media (a hard disk, a floppy disk, or magnetic tape), optical media (a CD-ROM or DVD), magneto-optical media (a floptical disk), or solid state memory, or a read-only recording medium such as a ROM arranged on hardware. Furthermore, where hardware is itself configured, through a hard-wired circuit structure, to perform operations equivalent to a series of program instructions, each step of the operations implementing an embodiment of the present invention can be regarded as recorded in the connection and arrangement of the hardware components. It is therefore evident to a person skilled in the art that such a connection and arrangement can be regarded as equivalent to the memory.

The embodiments described above for the processor and the memory are not mutually exclusive and may be selected or combined as needed. For example, a single hardware device may be configured to operate as a module composed of one or more pieces of the software in order to perform the operations of an embodiment of the present invention, and vice versa. As another example, all or part of the operations assigned to a functional unit in this specification may be implemented by one or more pieces of the software stored in a device according to an embodiment of the present invention (preferably in a recording medium belonging to the category of the memory) and executed by the processor. In that case, the functional unit may be referred to as a functional unit "included" in the processor.

The present invention has been described above with reference to the drawings and embodiments. As already stated, however, this does not mean that the scope of protection of the present invention is limited to those drawings or embodiments, and a person skilled in the relevant technical field will understand that the present invention can be modified and changed in various ways without departing from the spirit and scope of the present invention set out in the claims.

Claims

A method for decoding an encoded video bitstream, the method comprising:
a step of reading, from the bitstream, prediction mode information for prediction decoding of a current block of a picture currently being decoded;
a step of determining whether the prediction mode information indicates a mode based on intra block copy (IBC);
a step of reading, from the bitstream and based on the prediction mode information, at least one prediction vector specifying the location of at least one reference block in at least one picture including the picture currently being decoded;
a step of dividing the current block into two or more sub-blocks;
a step of determining whether the prediction mode information indicates a mode that uses refinement of the prediction vector;
a step of, in the case of a mode using the refinement, individually compensating and refining the prediction vector for each sub-block based on the prediction vector;
a step of obtaining two or more sub-reference blocks individually for each of the sub-blocks based on the prediction vector, and combining the respective sub-reference blocks to obtain at least one reference block;
a step of generating a prediction block based on the at least one reference block; and
a step of performing prediction decoding on the current block based on the prediction block.

The decoding method of claim 1,
wherein the step of reading the prediction vector comprises:
a step of obtaining an adaptive block vector resolution (ABVR) flag value from the bitstream;
a step of reading precision representation information of a block vector when the ABVR flag value falls within a first range;
a step of determining the precision of the prediction vector based on the ABVR flag value and the precision representation information; and
a step of reading the prediction vector based on the precision.

The decoding method of claim 2,
wherein the precision of the prediction vector
is determined as one of at least one precision unit including a sub-pixel unit.

The decoding method of claim 3,
wherein the precision of the prediction vector is determined:
in 1/4-pixel units when the ABVR flag value does not fall within the first range; and
when the ABVR flag value falls within the first range, in 1-pixel units if the precision representation information has a first value, and in 4-pixel units if the precision representation information has a second value.

The decoding method of any one of claims 2 to 4,
wherein the first range means the case where the ABVR flag value is “1”, and
the precision representation information is an index value having at least two cases.
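The precision selection recited in the ABVR claims above can be sketched as a small lookup. This is an illustration only: coding the "first value" as index 0 and the "second value" as index 1, storing vectors internally in quarter-pixel units, and the function names are assumptions not stated in the claims.

```python
from fractions import Fraction


def bv_precision(abvr_flag, precision_idx=None):
    """Return the block-vector precision in pixels.

    A flag outside the first range (here: flag != 1) gives quarter-pixel
    precision. Otherwise the index selects 1-pixel or 4-pixel precision.
    """
    if abvr_flag != 1:
        return Fraction(1, 4)
    # Assumed coding of the index: 0 is the "first value", 1 the "second".
    if precision_idx == 0:
        return Fraction(1)
    if precision_idx == 1:
        return Fraction(4)
    raise ValueError("precision index required when the ABVR flag is 1")


def scale_bv(coded_bv, precision):
    """Convert a coded BV (in units of `precision`) to quarter-pixel units."""
    factor = precision / Fraction(1, 4)
    return tuple(int(c * factor) for c in coded_bv)
```

For example, a coded vector of (-3, 2) at 4-pixel precision corresponds to (-48, 32) in quarter-pixel units under these assumptions.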

delete

The decoding method of claim 1,
wherein the step of refining the prediction vector
operates based on a template matching method having a search range based on the point indicated by the prediction vector.
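A template-matching refinement with a search range centered on the position indicated by the prediction vector could look like the sketch below. It is a minimal illustration under several assumptions: an L-shaped template one sample thick, a sum of absolute differences (SAD) cost, an integer-pixel search, and no check that a candidate lies inside the already decoded area. The function names are hypothetical, and in the claimed method such a search would be run separately for each sub-block.

```python
import numpy as np


def template(picture, x, y, size, thickness=1):
    """L-shaped template: rows above and columns left of the block at (x, y)."""
    top = picture[y - thickness:y, x - thickness:x + size]
    left = picture[y:y + size, x - thickness:x]
    return np.concatenate([top.ravel(), left.ravel()]).astype(np.int32)


def refine_bv(picture, x, y, size, bv, search_range=1, thickness=1):
    """Refine bv=(dx, dy) by minimizing template SAD around the indicated point."""
    height, width = picture.shape
    cur = template(picture, x, y, size, thickness)
    best_bv, best_cost = bv, None
    for ddy in range(-search_range, search_range + 1):
        for ddx in range(-search_range, search_range + 1):
            cx, cy = x + bv[0] + ddx, y + bv[1] + ddy
            # Skip candidates whose block or template leaves the picture.
            if cx < thickness or cy < thickness:
                continue
            if cx + size > width or cy + size > height:
                continue
            cand = template(picture, cx, cy, size, thickness)
            cost = int(np.abs(cur - cand).sum())
            if best_cost is None or cost < best_cost:
                best_bv, best_cost = (bv[0] + ddx, bv[1] + ddy), cost
    return best_bv


# Synthetic picture: a 5x5 pattern (block plus template) appears twice.
pic = np.zeros((32, 32), dtype=np.int32)
pattern = np.arange(1, 26, dtype=np.int32).reshape(5, 5) * 10
pic[9:14, 8:13] = pattern      # source block at (x=9, y=10) with its template
pic[19:24, 19:24] = pattern    # current block at (x=20, y=20) with its template
refined = refine_bv(pic, x=20, y=20, size=4, bv=(-10, -10), search_range=1)
```

Here the signaled vector (-10, -10) is one sample off horizontally, and the search moves it to the position whose template matches exactly.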

The decoding method of claim 1,
wherein the step of refining the prediction vector
performs compensation and correction with a precision higher than the precision of the prediction vector.

The decoding method of claim 9,
wherein the step of refining the prediction vector
compensates and corrects, in sub-pixel units, a prediction vector read in integer-pixel units.

The decoding method of claim 1,
further comprising a step of determining whether the prediction mode information indicates a mode that allows two or more prediction vectors,
wherein the step of reading the prediction vector includes: a step of reading a first prediction vector; and a step of reading a second prediction vector,
the step of obtaining the reference block includes: a step of obtaining a first reference block based on the first prediction vector; and a step of obtaining a second reference block based on the second prediction vector, and
the step of generating the prediction block generates the prediction block through prediction fusion based on two or more reference blocks including the first reference block and the second reference block.

The decoding method of claim 11,
wherein the first prediction vector and the second prediction vector are block vectors for intra prediction, and
the first prediction vector and the second prediction vector are configured to indicate the positions of different reference blocks within an already decoded area of the current picture.

The decoding method of claim 11,
wherein the first prediction vector is a block vector for intra prediction and is configured to indicate the location of the first reference block within an already decoded area of the current picture, and
the second prediction vector is a motion vector for inter prediction and is configured to indicate the position of the second reference block within a previously decoded picture.

The decoding method of claim 11,
wherein the prediction fusion is performed by a weighted sum or a weighted average using a combination of weights, and
information about the weights is obtained from the bitstream.

A video encoding method for generating an encoded bitstream, the method comprising:
a step of dividing a current block of a picture currently being encoded into two or more sub-blocks;
a step of determining, for at least one sub-block, at least one prediction vector that specifies the location of at least one reference block within at least one picture including the picture currently being encoded;
a step of individually compensating and refining the prediction vector for each sub-block based on the prediction vector;
a step of obtaining two or more sub-reference blocks individually for each sub-block based on the prediction vector, and combining the respective sub-reference blocks to obtain at least one reference block;
a step of generating a prediction block based on the at least one reference block;
a step of performing prediction encoding on the current block based on the prediction block;
a step of determining, based on the result of the prediction encoding, the prediction mode for the current block as a mode based on intra block copy (IBC);
a step of determining, based on the result of the prediction encoding, whether the prediction mode for the current block is a mode that uses refinement of the prediction vector;
a step of generating bitstream syntax representing the prediction mode and the at least one prediction vector based on the prediction encoding mode; and
a step of recording the bitstream syntax in the bitstream.

The encoding method of claim 15,
wherein the prediction vector is determined with at least one precision including a sub-pixel unit, and
the step of generating the bitstream syntax includes: a step of generating bitstream syntax representing a unit of precision based on an adaptive block vector resolution (ABVR) flag value and precision representation information of a block vector; and a step of generating bitstream syntax representing the prediction vector based on the unit of precision.

The encoding method of claim 15,
further comprising a step of compensating and correcting the prediction vector,
wherein the step of obtaining the reference block operates based on the compensated and corrected prediction vector,
the step of determining the prediction mode determines the prediction mode as a mode that uses refinement of the prediction vector, and
the step of generating the bitstream syntax generates bitstream syntax representing the prediction vector before the compensation and correction.

The encoding method of claim 15,
wherein the step of determining the prediction vector includes: a step of determining a first prediction vector; and a step of determining a second prediction vector,
the step of obtaining the reference block includes: a step of obtaining a first reference block based on the first prediction vector; and a step of obtaining a second reference block based on the second prediction vector,
the step of generating the prediction block generates the prediction block through prediction fusion based on two or more reference blocks including the first reference block and the second reference block, and
the step of determining the prediction mode determines the prediction mode as a mode that allows two or more prediction vectors.

The encoding method of claim 18,
wherein the first prediction vector is a block vector for intra prediction and is configured to indicate the location of the first reference block within an already encoded area of the current picture, and
the second prediction vector is a motion vector for inter prediction and is configured to indicate the position of the second reference block within a picture whose encoding has previously been completed.

A decoding device configured to decode a video bitstream encoded by a computer device, the decoding device comprising:
a processor;
a memory;
an input unit to which the bitstream is input;
an output unit that outputs the decoded video;
a reference buffer that stores information about at least one decoded picture, including the picture currently being decoded;
a bitstream parsing unit having functions of: reading, from the bitstream, prediction mode information for prediction decoding of a current block of the picture currently being decoded; determining whether the prediction mode information indicates a mode based on intra block copy (IBC); and reading, from the bitstream and based on the prediction mode information, at least one prediction vector designating a location of at least one reference block in at least one of the pictures;
a prediction decoding unit having functions of: dividing the current block into two or more sub-blocks; determining whether the prediction mode information indicates a mode using refinement of a prediction vector; if the mode uses the refinement, individually compensating and refining the prediction vector for each sub-block based on the prediction vector; obtaining two or more sub-reference blocks individually for each sub-block from at least one picture stored in the reference buffer based on the prediction vector, and combining the sub-reference blocks to obtain at least one reference block; generating a prediction block based on the at least one reference block; and performing prediction decoding on the current block based on the prediction block; and
a video decoding unit configured to decode the bitstream based on the result of the prediction decoding.