Video Coding Using Motion Compensation: (Chapter 9 - Continues)

Yao Wang Polytechnic University, Brooklyn, NY11201
Outline
Block-Based Hybrid Video Coding

Overview of Block-Based Hybrid Video Coding
Overlapped Block Motion Compensation
Coding mode selection and rate control
Loop filtering
Characteristics of Typical Videos
Frame t-1
Frame t
Adjacent frames are similar and changes are due to object or camera motion
Key Idea in Video Compression
Predict a new frame from a previous frame and only code the prediction error
Prediction error will be coded using the DCT method
Prediction errors have smaller energy than the original pixel values and can be coded with fewer bits
Those regions that cannot be predicted well will be coded directly using DCT
Work on each macroblock (MB) (16x16 pixels) independently for reduced complexity
  Motion compensation done at the MB level
  DCT coding of error at the block level (8x8 pixels)
Temporal Prediction
No Motion Compensation:
Works well in stationary regions
$\hat{f}(t, m, n) = f(t-1, m, n)$
Uni-directional Motion Compensation:
Does not work well for regions uncovered by object motion
$\hat{f}(t, m, n) = f(t-1, m - d_x, n - d_y)$
Bi-directional Motion Compensation:
Handles covered/uncovered regions better
$\hat{f}(t, m, n) = w_b f(t-1, m - d_{b,x}, n - d_{b,y}) + w_f f(t+1, m - d_{f,x}, n - d_{f,y})$
Encoder: Typical Block-Based Hybrid Video Coder
Block-based: each frame is divided into blocks of fixed size
Hybrid: motion-compensated temporal prediction and transform coding
VLC: variable-length coding (Huffman)
Decoder Block Diagram
Typical Block-Based Hybrid Video Coder
If temporal prediction is successful, the prediction error block needs fewer bits than the original block. This is called P-mode.
If not, code the original block directly using transform coding. This is called intra-mode.
If bidirectional prediction is used, it is called B-mode.
Both B-mode and P-mode are inter-modes.
Hence, I-frame, P-frame, B-frame.
Blocks for motion estimation are larger than blocks for DCT and are called macroblocks (MB).
A number of MBs form a group of blocks (GOB) or slice; several GOBs form a picture.
Different Coding Modes
MB Structure in 4:2:0 Color Format
4 8x8 Y blocks
1 8x8 Cb block
1 8x8 Cr block
Block Matching Algorithm for Motion Estimation
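[Figure: block matching. A block in frame t (predicted frame) is compared with candidate blocks inside a search region in frame t-1 (reference frame); the displacement to the best match is the motion vector (MV).]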
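A minimal sketch of exhaustive-search block matching, assuming frames are stored as 2-D lists of luma samples and that the matching criterion is the mean absolute difference (MAD). The function names, the default search range of +/- 7 pels, and the sign convention (the MV is the offset from the current block to its match in the reference frame) are illustrative assumptions, not taken from the slides.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total / (n * n)


def block_match(cur, ref, bx, by, n=16, search=7):
    """Exhaustive search over +/- `search` pels; returns ((dx, dy), best MAD)."""
    h, w = len(ref), len(ref[0])
    best = (0, 0)
    best_err = mad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the reference frame.
            inside = (0 <= bx + dx and bx + dx + n <= w and
                      0 <= by + dy and by + dy + n <= h)
            if not inside:
                continue
            err = mad(cur, ref, bx, by, dx, dy, n)
            if err < best_err:
                best, best_err = (dx, dy), err
    return best, best_err
```

The cost grows with the square of the search range, which is one reason practical encoders often replace the exhaustive search with faster strategies.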
Macroblock Coding in I-Mode
Apply the DCT to each 8x8 block
Quantize the DCT coefficients with properly chosen quantization matrices
The quantized DCT coefficients are zig-zag ordered and run-length coded
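The three I-mode steps can be sketched as follows. This is a simplified illustration: it uses a single uniform step size (`step`, a hypothetical parameter) in place of a quantization matrix, omits the final variable-length coding stage, and marks the end of the block with a plain `"EOB"` string.

```python
import math

N = 8


def dct2(block):
    """2-D DCT-II of an 8x8 block with orthonormal scaling (direct form)."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def zigzag_order(n=N):
    """Zig-zag scan order as a list of (row, col) pairs."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd anti-diagonals are traversed with the row increasing,
        # even ones with the row decreasing.
        order.extend(diag if s % 2 else diag[::-1])
    return order


def run_length(levels):
    """Turn scanned levels into (zero_run, level) pairs followed by 'EOB'."""
    pairs, run = [], 0
    for v in levels:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs


def code_intra_block(block, step=16):
    """DCT, uniform quantization, zig-zag scan, and run-length conversion."""
    coeffs = dct2(block)
    q = [[int(round(coeffs[u][v] / step)) for v in range(N)] for u in range(N)]
    scanned = [q[r][col] for r, col in zigzag_order()]
    return run_length(scanned)
```

A flat block has energy only in the DC coefficient, so it reduces to a single pair plus the end-of-block marker, which is why smooth regions are cheap to code.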
Macroblock Coding in P-Mode
Estimate one MV for each macroblock (16x16)
Depending on the motion compensation error, determine the coding mode (intra, inter-with-no-MC, inter-with-MC, etc.)
The original values (for intra mode) or motion compensation errors (for inter mode) in each of the DCT blocks (8x8) are DCT transformed, quantized, zig-zag/alternate scanned, and run-length coded
Macroblock Coding in B-Mode
Same as for the P-mode, except a macroblock can be predicted from a previous picture, a following one, or both.
[Figure: a B-mode macroblock with its forward motion vector (vf) and backward motion vector (vb)]
Overlapped Block Motion Compensation (OBMC)

Conventional block motion compensation
One best matching block is found from a reference frame
The current block is replaced by the best matching block
OBMC
Each pixel in the current block is predicted by a weighted average of several corresponding pixels in the reference frame
The corresponding pixels are determined by the MVs of the current as well as adjacent MBs
The weight for each corresponding pixel depends on the expected accuracy of the associated MV
$\psi(\mathbf{x})$, $\psi_r(\mathbf{x})$, $\psi_p(\mathbf{x})$: coded, reference, and predicted frames
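A rough sketch of OBMC, assuming each block's MV contributes a candidate prediction over a 2B x 2B window centred on the block, weighted by a separable triangular window. The window shape, the even block size, the edge clamping, and the normalization step are illustrative assumptions; they are not the H.263 weights.

```python
def obmc_predict(ref, mvs, bsize):
    """Predict a frame from `ref` with overlapped block motion compensation.

    ref   : 2-D list (H x W) of reference samples, H and W multiples of bsize
    mvs   : mvs[j][i] = (dx, dy) for the block in block-row j, block-column i
    bsize : block size B (assumed even)
    """
    h, w = len(ref), len(ref[0])
    acc = [[0.0] * w for _ in range(h)]   # weighted sum of candidate predictions
    wsum = [[0.0] * w for _ in range(h)]  # sum of weights, for normalization

    def tri(k):
        # Triangular window over 2B samples, largest at the block centre.
        return 1.0 - abs(k + 0.5 - bsize) / bsize

    for j, row in enumerate(mvs):
        for i, (dx, dy) in enumerate(row):
            y0 = j * bsize - bsize // 2
            x0 = i * bsize - bsize // 2
            for ky in range(2 * bsize):
                for kx in range(2 * bsize):
                    y, x = y0 + ky, x0 + kx
                    if not (0 <= y < h and 0 <= x < w):
                        continue
                    # Clamp the displaced position to the frame.
                    ry = min(max(y + dy, 0), h - 1)
                    rx = min(max(x + dx, 0), w - 1)
                    wgt = tri(ky) * tri(kx)
                    acc[y][x] += wgt * ref[ry][rx]
                    wsum[y][x] += wgt
    return [[acc[y][x] / wsum[y][x] for x in range(w)] for y in range(h)]
```

Because each pixel blends predictions from its own block's MV and its neighbours' MVs, the prediction changes smoothly across block boundaries, which tends to reduce blocking artifacts in the prediction error.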
OBMC Using 4 Neighboring MBs
Should be inversely proportional to the distance between x and the center of
Optimal Weighting Design

Convert to an optimization problem:
Minimize
Subject to
Optimal weighting functions (solution):

(R: autocorrelation matrix; r: cross-correlation vector)
How to Determine MVs with OBMC
Option 1: use the conventional block-matching algorithm (BMA), minimizing the prediction error (MAD) within each MB independently
Option 2: minimize the prediction error assuming OBMC
Solve for the MV of the current MB while keeping the MVs of the neighboring MBs fixed at the values found in the previous iterations
Weighting Coefficients Used in H.263
Window Function Corresponding to H.263 Weights for OBMC
Rate Control
Rate control:
The coding method necessarily yields a variable bit rate
Rate control is necessary when the video is to be sent over a constant bit rate (CBR) channel, where the rate averaged over a short period should be constant
The fluctuation within the period can be smoothed by a buffer at the encoder output
Problem of rate control:

Step 1) Determine the target rate at the frame, GOB, and MB level, based on the current buffer fullness
Step 2) Satisfy the frame-level target rate by varying the frame rate (skip frames when necessary)
Step 3) Satisfy the GOB/MB-level target rate by varying the coding mode at each MB
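A toy simulation of the encoder output buffer, illustrating only the frame-skipping part of Step 2. It assumes the bits each frame would produce are known in advance (`frame_bits`, a hypothetical input) and that a frame is skipped whenever coding it would overflow the buffer; real rate control also adjusts the coding mode within a frame, which is not modelled here.

```python
def simulate_buffer(frame_bits, channel_rate, frame_rate, buffer_size):
    """Simulate an encoder buffer drained by a CBR channel.

    frame_bits   : bits each frame would produce if coded
    channel_rate : channel rate in bits per second
    frame_rate   : source frame rate in frames per second
    buffer_size  : buffer capacity in bits
    Returns a list of (coded, fullness_after_drain), one entry per frame.
    """
    drain = channel_rate / frame_rate  # bits leaving the buffer per frame interval
    fullness = 0.0
    log = []
    for bits in frame_bits:
        # Skip the frame if coding it would overflow the buffer.
        coded = fullness + bits <= buffer_size
        if coded:
            fullness += bits
        fullness = max(0.0, fullness - drain)
        log.append((coded, fullness))
    return log
```

For example, with a 30 kbit/s channel at 30 frames/s the buffer drains 1000 bits per frame interval, so a run of 3000-bit frames forces occasional skips when the buffer holds only 4000 bits.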
Video Coding Standards

(Chapter 13)
Outline
Overview of Standards and Their Applications
ITU-T (International Telecommunication Union) Standards for Audio-Visual Communications
H.261
H.263
H.263+, H.263++
ISO (International Organization for Standardization) Standards for

MPEG-1
MPEG-2
MPEG-4
MPEG-7
Standardization
ITU, International Telecommunication Union: 1865
International Radiotelegraph Convention signed: 1906
CCIF, International Telephone Consultative Committee: 1924
CCIT, International Telegraph Consultative Committee: 1925
CCIR, International Radio Consultative Committee: 1927
CCIT + CCIF merge -> CCITT: 1956; published H.261 in 1989
CCIR -> ITU-R; CCITT -> ITU-T: 1992
ITU-T has Study Groups (SG); SG 16 covers multimedia. Each SG is divided into Working Parties (WP), each dealing with several Questions.
http://www.itu.int/home/index.html
Standardization
International Electrotechnical Commission (IEC): 1906
International Organization for Standardization (ISO): 1947
Joint ISO/IEC Technical Committee 1 (JTC1) on Information Technology
Subcommittee 24: Computer Graphics and Image Processing (VRML)
Subcommittee 29: Coding (MPEG)
http://www.iso.org/iso/en/ISOOnline.frontpage
Multimedia Communications Standards and Applications
Standards | Application | Video Format | Raw Data Rate | Compressed Data Rate
H.320 (H.261) | Video conferencing over ISDN | CIF | 37 Mbps | >=384 Kbps
H.320 (H.261) | Video conferencing over ISDN | QCIF | 9.1 Mbps | >=64 Kbps
H.323 (H.263) | Video conferencing over Internet | 4CIF/CIF/QCIF | | >=64 Kbps
H.324 (H.263) | Video over phone lines/wireless | QCIF | 9.1 Mbps | >=18 Kbps
MPEG-1 | Video distribution on CD/WWW | CIF | 30 Mbps | 1.5 Mbps
MPEG-2 | Video distribution on DVD/digital TV | CCIR601 4:2:0 | 128 Mbps | 3-10 Mbps
MPEG-4 | Multimedia distribution over Inter/Intra net | QCIF/CIF | | 28-1024 Kbps
GA-HDTV | HDTV broadcasting | SMPTE296/295 | <=700 Mbps | 18-45 Mbps
MPEG-7 | Multimedia databases (content description and retrieval) | | |
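The raw data rates quoted for CIF and QCIF can be checked with a short calculation. This sketch assumes 4:2:0 sampling (1.5 samples per pixel), 8 bits per sample, and 30 frames per second; the frame rate is an assumption, since it is not stated alongside the figures.

```python
def raw_rate_mbps(width, height, fps, bits_per_sample=8):
    """Uncompressed bit rate in Mbps for 4:2:0 video (1.5 samples per pixel)."""
    samples_per_frame = width * height * 3 // 2
    return samples_per_frame * bits_per_sample * fps / 1e6


cif = raw_rate_mbps(352, 288, 30)    # about 36.5 Mbps, quoted as 37 Mbps
qcif = raw_rate_mbps(176, 144, 30)   # about 9.1 Mbps
```

Comparing these with the compressed rates gives a sense of the compression ratios involved, for example roughly 9.1 Mbps down to 64 Kbps for QCIF video conferencing.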
H.261 Video Coding Summary

H.261 is a 1990 ITU video coding standard originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. The coding algorithm was designed to operate at data rates between 40 kbit/s and 2 Mbit/s. The standard supports CIF and QCIF video frames with luma resolutions of 352x288 and 176x144, respectively (and 4:2:0 sampling with chroma resolutions of 176x144 and 88x72, respectively).
History: H.261 was the first practical digital video coding standard. The H.261 design was a pioneering effort; all subsequent international video coding standards (MPEG-1, MPEG-2/H.262, H.263, and even H.264) have been based on its design.
Design: The basic processing unit of the design is called a macroblock. Each macroblock consists of a 16x16 array of luma samples and two corresponding 8x8 arrays of chroma samples, using 4:2:0 sampling and a YCbCr color space. Inter-picture prediction removes temporal redundancy, with motion vectors (MVs) used to help the codec compensate for motion. Transform coding using an 8x8 discrete cosine transform (DCT) removes spatial redundancy. Scalar quantization is then applied to round the transform coefficients to the appropriate precision, and the quantized transform coefficients are zig-zag scanned and entropy coded (using a run-level variable-length code) to remove statistical redundancy.
H.261 Video Coding Standard

For video-conferencing/video phone
Video coding standard in H.320
Low delay (real-time, interactive)
Slow motion in general
For transmission over ISDN
Fixed bandwidth: p x 64 Kbps, p = 1, 2, ..., 30
Video Format:
CIF (352x288, above 128 Kbps)
QCIF (176x144, 64-128 Kbps)
4:2:0 color format, progressive scan
Published in 1990
Each macroblock can be coded in intra- or inter-mode
Periodic insertion of intra-mode to eliminate error propagation due to network impairments
Integer-pel accuracy motion estimation in inter-mode
H.261 Encoder
T: transform; Q: quantizer
F: loop filter (low-pass filters the prediction); P: motion estimation and compensation
Coding Control
p: flag, INTRA/INTER
t: flag
qz: quantizer indication
q: quantizing index for transform coefficients
d: motion vector
f: on/off loop filter
DCT Coefficient Quantization
DC Coefficient in Intra-mode: Uniform, step size=8

Others: Uniform with dead zone, step size=2~64 (MQUANT)
Dead zone: [-T,T]
The dead zone avoids coding too many small coefficients, which are typically due to noise
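A small sketch of a uniform quantizer with a dead zone [-T, T]. The bin layout beyond the dead zone and the mid-point reconstruction rule are illustrative assumptions; the standard specifies its own reconstruction formula, which is not reproduced here.

```python
def quantize_deadzone(coeff, step, dead_zone):
    """Map a coefficient to an integer level; values in [-T, T] become 0."""
    mag = abs(coeff)
    if mag <= dead_zone:
        return 0
    level = int((mag - dead_zone) // step) + 1
    return level if coeff > 0 else -level


def dequantize_deadzone(level, step, dead_zone):
    """Reconstruct at the mid-point of the bin selected by `level`."""
    if level == 0:
        return 0.0
    mag = dead_zone + (abs(level) - 0.5) * step
    return mag if level > 0 else -mag
```

With `dead_zone = step / 2` this reduces to an ordinary rounding quantizer; a larger `dead_zone` sends more small, noise-like coefficients to zero, which lengthens the zero runs seen by the run-length coder.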
Motion Estimation and Compensation
Integer-pel accuracy in the range [-16,16]
Methods for generating the MVs are not specified in the standard
(Standards only define the bit stream syntax, or the decoder operation)
MVs are coded differentially (DMV)
Encoder and decoder use the decoded MVs to perform motion compensation
Loop filtering can be applied to suppress propagation of coding noise temporally
Separable filter [1/4,1/2,1/4]
Loop filter can be turned on or off
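A sketch of the separable [1/4, 1/2, 1/4] loop filter applied to one block of the prediction. It assumes the filter operates within a single block and leaves the samples on the block edge unchanged along the direction in which a tap would fall outside the block; integer rounding is omitted for clarity.

```python
def filter_1d(v):
    """Apply [1/4, 1/2, 1/4] to interior samples; end samples stay unchanged."""
    out = list(v)
    for i in range(1, len(v) - 1):
        out[i] = (v[i - 1] + 2 * v[i] + v[i + 1]) / 4
    return out


def loop_filter(block):
    """Separable low-pass filter: filter every row, then every column."""
    rows = [filter_1d(r) for r in block]
    cols = [filter_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

Filtering the prediction smooths out quantization noise carried over from the previously decoded frame, at the cost of slightly blurring the prediction, which is why the filter can be switched on or off.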
Variable Length Coding
DCT coefficients are converted into run-length representations and then coded using VLC (Huffman coding for each pair of symbols)
Symbol: (Zero run-length, non-zero value range)
Other information is also coded using VLC (Huffman coding)
Parameter Selection and Rate Control
MTYPE: macroblock type (intra vs. inter, zero vs. non-zero MV in inter)
CBP: coded block pattern (which blocks in a MB have non-zero DCT coefficients)
MQUANT: optional (allows changes of the quantizer step size at the MB level); should be varied to satisfy the rate constraint
MV: ideally should be determined not only by the prediction error but also by the total bits used for coding the MV and the DCT coefficients of the prediction error
Loop filter: on/off
Formats supported
Note: H.261 encoder must support specific formats
H.263 Video Coding Summary

H.263 is a video codec originally designed by the ITU-T in 1995/1996 as a low-bitrate compressed format standard for videoconferencing. It is one member of the H.26x family of video coding standards in the domain of the ITU-T Video Coding Experts Group (VCEG).
The codec was first designed to be used in H.324-based systems (PSTN and other circuit-switched network videoconferencing and videotelephony), but has since found use in H.323 (RTP/IP-based videoconferencing), H.320 (ISDN-based videoconferencing), RTSP (streaming media), and SIP (Internet conferencing) solutions as well.
Most Flash Video content (as used on sites such as YouTube, Google Video, MySpace, etc.) is encoded in this format, though some sites now use VP6 encoding, which has been supported since Flash 8. H.263 video can be decoded with the free LGPL-licensed libavcodec library (part of the ffmpeg project), which is used by programs such as ffdshow, VLC media player, and MPlayer.
H.263 was developed as an evolutionary improvement based on experience from H.261, the previous ITU-T standard for video compression, and the MPEG-1 and MPEG-2 standards. Its first version was completed in 1995 and provided a suitable replacement for H.261 at all bitrates.
H.263 Video Coding Standard
H.263 is the video coding standard in H.323/H.324, targeted for visual telephony over PSTN or the Internet
Developed later than H.261, it can accommodate computationally more intensive options
Initial version (H.263 baseline): 1995
H.263+: 1997
H.263++: 2000
Goal: Improved quality at lower rates
Result: Significantly better quality at lower rates
Better video at 18-24 Kbps than H.261 at 64 Kbps
Enables video phone over regular phone lines (28.8 Kbps) or wireless modem
Improvements over H.261

Better motion estimation
Half-pel accuracy motion estimation with bilinear interpolation filter
Larger motion search range [-31.5,31], and unrestricted MV at boundary blocks
More efficient predictive coding for MVs (median prediction using three neighbors)
Overlapping block motion compensation (option)
Variable block size: 16x16 -> 8x8, 4 MVs per MB (option)
Use of bidirectional temporal prediction (PB picture) (option)
3-D VLC for DCT coefficients (runlength, value, EOB-end of block)
Syntax-based arithmetic coding (option)

4% savings in bit rate for P-mode, 10% savings for I-mode, at 50% more computations
The options, when chosen properly, can improve the PSNR by 0.5-1.5 dB over the default in the 20-70 kbps range.
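Median prediction of MVs can be sketched as follows, assuming the three candidate predictors are the MVs of already coded neighbouring macroblocks (typically the left, above, and above-right neighbours) and that the median is taken separately for each component. The function names are illustrative.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv1, mv2, mv3):
    """Component-wise median of three neighbouring motion vectors."""
    return (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))


def mv_difference(mv, pred):
    """Difference between the actual MV and its prediction; this is what gets coded."""
    return (mv[0] - pred[0], mv[1] - pred[1])
```

For neighbours (1, 4), (3, -2), and (2, 0) the predictor is (2, 0), so an actual MV of (2.5, 0.5) is coded as the small difference (0.5, 0.5). The median is less sensitive to a single outlier neighbour than simply using the previous MV.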
Prediction of MVs
Half-pixel motion compensation if needed
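Half-pel motion compensation needs samples at positions halfway between pixels. A minimal sketch of bilinear interpolation, assuming positions are given in half-pel units (so `x2 = 3` means x = 1.5) and ignoring the integer rounding a real codec applies:

```python
def sample_half_pel(ref, x2, y2):
    """Sample `ref` at (x2 / 2, y2 / 2); x2 and y2 are in half-pel units.

    The caller must keep the position (and its right/lower neighbour,
    when needed) inside the frame.
    """
    x, y = x2 // 2, y2 // 2
    fx, fy = x2 % 2, y2 % 2   # 1 if the position lies between two samples
    a = ref[y][x]
    b = ref[y][x + fx]
    c = ref[y + fy][x]
    d = ref[y + fy][x + fx]
    # At integer positions all four terms coincide, so the average is exact.
    return (a + b + c + d) / 4
```

For the 2x2 frame `[[10, 20], [30, 40]]`, the horizontal half-pel sample is 15, the vertical one is 20, and the centre sample is 25.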
Prediction of MVs
The encoder needs to decide for which MBs 4 MVs should be used, as extra bits are needed for coding them
PB-Picture Mode
PB-picture mode codes two pictures as a group. The second picture (P) is coded first, then the first picture (B) is coded using both the P-picture and the previously coded picture. This avoids the reordering of pictures required in the normal B-mode. However, it still incurs more coding delay than using P-frames only.
In a B-block, forward prediction (predicted from the previous frame) can be used for all pixels; backward prediction (from the future frame) is used only for those pels that the backward motion vector aligns with pels of the current MB. Pixels in the white area use only forward prediction.
An improved PB-frame mode, defined in H.263+, removes this restriction.
Performance of H.261 and H.263
[Figure: performance comparison on the Foreman sequence (QCIF, 12.5 Hz). Curves: OBMC, 4 MVs, etc.; half-pel MC, +/- 32; integer MC, +/- 16, with loop filter; integer MC, +/- 32; integer MC, +/- 16.]
H.323 protocols for multimedia
ITU-T Multimedia Communications Standards
H.324 Terminal (multimedia communication over PSTN)
