
2006 IEEE International Symposium on Signal Processing and Information Technology

Temporal Motion Prediction for Fast Motion Estimation in Multiple Reference Frames
G Nageswara Rao, PSSBK Gupta
Emuzed, A Flextronics Company, Bangalore, India
Email: {nageswararao.gunupudi, shyam.pallapothu}@flextronicssoftware.com
Abstract--- In the new video coding standard H.264/MPEG-4 Part 10, motion compensation is allowed to use multiple reference frames, which improves rate-distortion performance at the cost of a drastic increase in complexity. The additional computation grows in proportion to the number of searched reference frames, while the reduction in prediction residues is highly dependent on the nature of the sequence. In this paper, we present a fast technique to predict the motion vector in each reference frame in order to speed up the matching process for multiple reference frames. The proposed technique is based on choosing a motion centre and carrying out the search around that centre with a radius of 1 or 2 pixels in all reference frames, with the exception of the one that immediately precedes the current frame. For the reference frame that immediately precedes the current frame, any motion estimation technique can be used. The results show that the proposed technique reduces the computational requirements down to those of single reference frame motion estimation with only a negligible loss of objective quality.

Index terms--- H.264/AVC, Motion Estimation, Multi Frame Motion Compensation (MFMC)
I. INTRODUCTION

H.264/AVC is the newest video coding standard [1] of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. H.264/AVC introduced many new coding and error-resilience tools and, as a result, has achieved a significant improvement in rate-distortion efficiency relative to existing standards; a bitrate reduction of up to 50% is achieved over the MPEG-4 Advanced Simple Profile. The motion compensation process defined in H.264, with quarter-pixel accuracy, variable block sizes and multiple reference frames, greatly reduces prediction errors. Multi-frame motion compensation (MFMC) is one such tool that provides significant coding gain as well as better error robustness. The multi-frame buffer at the encoder and decoder stores frames that are useful for motion-compensated prediction. Multi-frame motion estimation was proposed as a technique to improve the error resilience of compressed video by Budagavi and Gibson [2]. The MFMC coder exploits the redundancy that exists across multiple frames in typical videoconferencing sequences to achieve additional compression over that obtained by the single frame motion compensation (SFMC) approach [3]. The advantage of the MFMC approach is that it is more robust to error propagation than traditional SFMC, as it makes use of information from multiple frames.

As a consequence of the many prediction tools in motion compensation, the motion estimation process in H.264 becomes even more complex. Multi-frame motion compensation improves the rate-distortion performance substantially, but it introduces a much higher load on the system. Without considering temporal correlations between multiple reference frames, conventional single-frame search algorithms can still be applied to multi-frame motion estimation, but only in a rather inefficient frame-by-frame manner. This increases the complexity of motion estimation (ME) relative to single frame motion estimation by a factor equal to the number of reference frames, since ME must be carried out for every reference frame, which is not feasible for most real-time implementations. Moreover, the decrease in prediction residues depends on the nature of the sequence: sometimes the prediction gain is very significant, but sometimes a lot of computation is wasted without any considerable bit-rate reduction. Consequently, several fast techniques for motion estimation in multiple reference frames have been developed recently.

In this paper we propose an effective algorithm to accelerate multiple reference frame ME without significant loss of video quality. The algorithm makes use of the motion vectors already computed with respect to the first reference frame, i.e. the reference frame closest to the current frame, to predict the best motion vectors for the second and subsequent reference frames. The proposed scheme minimizes the computational overhead of motion estimation in reference frames other than the first, and it is independent of the ME algorithm used for the first reference frame. The main goal of the algorithm is to perform a fast search in all reference frames other than the first while maintaining a PSNR quality similar to that obtained when the full search block-matching algorithm (FSBMA) is used in each reference frame. The rest of the paper presents statistics of multi-frame motion estimation (Section II), the proposed motion vector prediction system and the motion estimation algorithm (Sections III and IV), simulation results (Section V) and conclusions (Section VI).

0-7803-9754-1/06/$20.00 ©2006 IEEE


Table.1. Quality (PSNR) and complexity for multiple reference frame motion estimation

II. MULTI FRAME MOTION ESTIMATION

Multi-frame motion estimation extends the temporal displacement vectors used in block-matching video coding by permitting the use of more frames than just the previously decoded one for motion-compensated prediction. The use of multiple frames for motion estimation in many cases provides a significant improvement in coding gain [5] and also provides better error robustness. It is well agreed that motion estimation is the most complex module of standards-based video encoders, and its complexity increases N times when the motion vector search covers N reference frames. Nevertheless, the decrease in prediction residues depends on the nature of the video content. The new H.264 standard supports hybrid block motion compensation with multiple reference frames, which significantly increases the complexity of video encoders for real-time implementations. Given an inter mode, the reference software JM9.5 adopts a full search and carries out the matching process in all reference frames one by one. The best mode is chosen by minimizing a Lagrangian cost function, which considers both the 2-D 4x4 Hadamard-transformed SAD (SATD) and the number of bits required to code the side information (a minimal sketch of the SATD measure is given at the end of this section).

Table.1 gives the PSNR results and ME complexity in seconds for different sequences with one, two and three reference frames at a bit rate of 512 kbps. Table.2 shows the percentage of references taken from the first and the other reference frames. It is observed that the number of macroblocks referenced in the farthest reference frames decreases, and that the number of references from all other reference frames combined is less than that from the first reference frame. Table.2 shows that references for motion compensation most probably come from the first reference frame, except for the Salesman sequence; however, the references from the other reference frames are not insignificant, which shows the scope for designing fast and accurate ME techniques in secondary reference frames. A PSNR loss is observed for the Salesman sequence (Table.1) when going from two to three reference frames, which is explained by its reference frame statistics in Table.2: the increase in overhead bits for reference frame index signalling in the three-reference-frame case is not compensated by the gain in residual error.

Frames and motion vectors classification in MFMC: The statistics and observations presented above give a specific importance to the reference frame that temporally precedes the current frame and to the motion vectors corresponding to that reference frame. Hence a classification of the reference frames and motion vectors used in MFMC is introduced here. We term the reference frame that immediately precedes the current frame the primary reference frame (PRF) and all others secondary reference frames (SRF). Similarly, the motion vectors obtained with respect to the primary reference frame are termed primary motion vectors (PMV) and all others secondary motion vectors (SMV). This classification of primary and secondary reference frames and motion vectors adds clarity to the motion estimation process in terms of complexity-accuracy tradeoffs. The proposed technique addresses motion estimation in secondary reference frames; motion estimation in the primary reference frame may use any fast technique.
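As the sketch referenced above, the routine below illustrates the 4x4 Hadamard-transformed SAD (SATD) that enters the JM cost function. It is only an illustrative sketch: the function name and the final normalization by 2 are assumptions, not code taken from JM.

```c
#include <stdlib.h>

/* Sketch: SATD of a 4x4 residual block diff[r][c] = original - predicted.
 * A 4-point Hadamard transform is applied to the rows, then to the
 * columns, and the absolute transform coefficients are summed. */
static int satd_4x4(const int diff[4][4])
{
    int tmp[4][4];
    int satd = 0;

    /* Row transform (4-point Hadamard butterflies) */
    for (int r = 0; r < 4; r++) {
        int a = diff[r][0] + diff[r][3];
        int b = diff[r][1] + diff[r][2];
        int c = diff[r][1] - diff[r][2];
        int d = diff[r][0] - diff[r][3];
        tmp[r][0] = a + b;
        tmp[r][1] = c + d;
        tmp[r][2] = a - b;
        tmp[r][3] = d - c;
    }
    /* Column transform and accumulation of absolute coefficients */
    for (int col = 0; col < 4; col++) {
        int a = tmp[0][col] + tmp[3][col];
        int b = tmp[1][col] + tmp[2][col];
        int c = tmp[1][col] - tmp[2][col];
        int d = tmp[0][col] - tmp[3][col];
        satd += abs(a + b) + abs(c + d) + abs(a - b) + abs(d - c);
    }
    return satd / 2;   /* common normalization for Hadamard SATD (assumed) */
}
```

The Lagrangian cost then adds lambda times the side-information rate to this distortion term, as described above.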
III. TEMPORAL PREDICTION OF MOTION VECTOR

The temporal prediction algorithm proposed here is based on the fact that the motion of macroblocks can be tracked temporally over the frames through the primary motion vectors of the frames lying between the current frame and the reference frame. The best match of a block in the current frame with a block in a reference frame can be tracked temporally by adding the motion vectors between every two neighbouring frames that lie between the current frame and the reference frame, i.e. the primary motion vectors of the frames between the current frame and the reference frame.
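To make the tracking concrete, the sketch below chains primary motion vectors frame by frame to predict a search center for a secondary reference frame. It is only a sketch under assumed interfaces: the pmv_lookup_fn accessor, the 16x16 block granularity and all identifiers are illustrative choices rather than the authors' data structures; the block-alignment step corresponds to the rounding used in Eq. 1 below.

```c
typedef struct { int x, y; } MV;

/* Assumed accessor for the stored primary-motion-vector field: returns the
 * PMV of the block whose origin is (bx, by) in frame 'frame_idx', where
 * 0 is the current frame, 1 the primary reference frame, and so on. */
typedef MV (*pmv_lookup_fn)(int frame_idx, int bx, int by, void *ctx);

#define BLOCK_SIZE 16   /* illustrative block granularity of the PMV field */

/* Round a pixel coordinate to the nearest block origin and clamp it inside
 * the frame, as done for (xb1, yb1) in Eq. 1. */
static int snap_to_block(int v, int frame_dim)
{
    int b = ((v + BLOCK_SIZE / 2) / BLOCK_SIZE) * BLOCK_SIZE;
    if (b < 0) b = 0;
    if (b > frame_dim - BLOCK_SIZE) b = frame_dim - BLOCK_SIZE;
    return b;
}

/* Predict the search center in the reference frame that lies 'hops' frames
 * before the current frame (hops = 2 for the first secondary reference
 * frame) by chaining primary motion vectors. */
MV predict_search_center(int cur_x, int cur_y, int hops,
                         int frame_w, int frame_h,
                         pmv_lookup_fn pmv, void *ctx)
{
    int px = cur_x, py = cur_y;

    for (int f = 0; f < hops; f++) {
        int bx = snap_to_block(px, frame_w);
        int by = snap_to_block(py, frame_h);
        MV v = pmv(f, bx, by, ctx);   /* PMV of the tracked block */
        px = bx + v.x;                /* follow the motion one frame back */
        py = by + v.y;
    }
    return (MV){ px, py };
}
```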

Table.2. Reference frame statistics with 10 reference frames


Fig.1. Deviation of the predicted motion vector from the best motion vector in the secondary reference frame (Foreman and Mobile sequences)

It is empirically observed that the motion across frames is linear and smooth. Hence, tracking the motion between every two frames using primary motion vectors and summing the motion vectors between each pair of adjacent frames from the current frame to the secondary reference frame identifies the best match of a block in the current frame. Let us say the current block is at (x, y) in the current frame and is found to be best matched at location (x1, y1) in the primary reference frame, i.e. (x1 - x, y1 - y) is the primary motion vector (PMV) of the block. It can then be strongly argued that block (x, y) is closely matched by the block that best matches (x1, y1). However, (x1, y1) may not be aligned with a block boundary of the reference frame at T-1, so (x1, y1) is approximated by the nearest block boundary (xb1, yb1), and the motion vector of block (xb1, yb1), say (mvx2, mvy2), is used as the predicted motion vector for the reference frame at T-2. If (xb1, yb1) falls outside the frame boundary, it is clipped to the nearest valid block position. The best match for the current block is then likely to lie around (x2, y2) in the first secondary reference frame, so the motion search for the current block in its first secondary reference frame can be carried out around (x2, y2). The search center of motion estimation in the first secondary reference frame is thus formulated as follows; Fig.2 illustrates the temporal motion tracking for secondary reference frames.

(mvx2, mvy2) = PMV of block (xb1, yb1)
(x2, y2) = (xb1 + mvx2, yb1 + mvy2), where
xb1 = {(x1 + BlockSize/2) >> B} * BlockSize
yb1 = {(y1 + BlockSize/2) >> B} * BlockSize
B = log2(BlockSize)                                   (1)

Statistics showing the deviation of the predicted motion vector from the finally selected best motion vector in the secondary reference frame are given for the Foreman and Mobile sequences in Fig.1. About 80% to 90% of the predicted motion vectors fall within a ±2 pixel displacement of the best motion vector, and about 60% fall within a ±1 pixel displacement. Thus a motion search around the predicted motion vector of Eq.1 with a search range of ±2 pixels is sufficient to find the best match of the given macroblock in the given reference frame. PSNR results for fast motion estimation with search ranges of ±1 and ±2 (referred to as the 1x1 and 2x2 searches, respectively) are given in Table 3. To further speed up the computation of motion vectors in secondary reference frames, we adopt an iterative pattern with a ±1 search area: the search is first carried out in a ±1 area around the seed and terminates if the best motion vector is found at the center of the search area; otherwise the search continues in a ±1 range around the best motion vector position of the current search area. Table 3 also shows results for this iterative strategy with a ±1 pixel search window and two iterations. The early exit of the iterative technique enables much faster convergence of motion estimation in secondary reference frames. Fig.3 shows alternative search patterns that can be applied around the search center.
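The iterative ±1 refinement just described can be sketched as follows. This is only an illustrative sketch: the cost callback (e.g. SATD plus motion-vector rate), the function name refine_pm1 and the parameterized iteration count are assumptions, not code from the authors' implementation or from JM.

```c
typedef struct { int x, y; } MV;

/* Cost of matching the current block with motion vector (mx, my) in the
 * given secondary reference frame (e.g. SATD plus motion-vector rate). */
typedef unsigned (*mv_cost_fn)(int mx, int my, void *ctx);

/* Iterative +/-1 search around a predicted seed vector.  Each iteration
 * examines the 3x3 neighbourhood of the current center; the search stops
 * early when the center itself is the best candidate. */
MV refine_pm1(MV seed, int max_iters, mv_cost_fn cost, void *ctx)
{
    MV center = seed;

    for (int it = 0; it < max_iters; it++) {
        MV best = center;
        unsigned best_cost = cost(center.x, center.y, ctx);

        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                if (dx == 0 && dy == 0)
                    continue;
                unsigned c = cost(center.x + dx, center.y + dy, ctx);
                if (c < best_cost) {
                    best_cost = c;
                    best = (MV){ center.x + dx, center.y + dy };
                }
            }
        }
        if (best.x == center.x && best.y == center.y)
            break;              /* early exit: center already the best MV */
        center = best;          /* move the center and search again */
    }
    return center;
}
```

With max_iters set to 2 this corresponds to the two-iteration strategy reported in Table 3, and the early return implements the exit condition described above.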

Fig.2. Motion vector tracking over the frames to predict the motion vector in secondary reference frames

Fig.3. Search patterns for motion estimation in secondary reference frames


Table.3. Quality (PSNR) and complexity of the different fast MFMC schemes proposed, with two reference frames

IV. PROPOSED ALGORITHM

The observations made in the previous sections lead to a fast motion estimation algorithm for secondary reference frames, which is presented in this section. The seed point for the motion search is obtained from the primary motion vectors (PMVs) of the frames that lie between the current frame and the reference frame under ME search. However, to cover different kinds of motion and make the algorithm more robust, we also include the zero motion vector (ZMV) and the predicted motion vector defined in the H.264 standard (SPMV) in the seed point selection. Among the ZMV, the temporally predicted motion vector and the standard predicted motion vector (SPMV), the motion vector which results in the least rate-distortion cost is selected as the seed point, and an iterative search within a ±1 search area is carried out about the seed point. The algorithm for motion estimation with multiple reference frames is summarized below; a compact code sketch of the seed selection is given after Section VI.

Step 1: Perform motion estimation in the primary reference frame (any fast technique may be applied) and store the primary motion vectors.
Step 2: Compute the temporal motion vector predictor using the primary motion vectors of the frames between the current frame and the reference frame under search. The seed for ME in the first secondary reference frame is the primary motion vector of the block in the primary reference frame that maximally covers the best match of the current block of the current frame.
Step 3: Choose the best seed among the zero motion vector, the standard predicted motion vector and the temporally predicted motion vector derived in Step 2 by minimizing the R-D cost.
Step 4: Search for the best match of the current block in the secondary reference frame around the best seed computed in Step 3 with a radius of ±1.
Step 5: If the best MV resulting from Step 4 is the search center, i.e. the seed point, take it as the best MV for the current block and go to Step 6. Otherwise, the best MV of the ±1 search becomes the new search center and the search continues about it with a ±1 range.
Step 6: Repeat Steps 2 to 5 for all the secondary reference frames.

V. SIMULATION RESULTS

The JM9.5 reference software is used to generate the simulation results for the proposed algorithm. All results are generated in the Baseline profile for CIF (352x288) resolution at 512 kbps and a frame rate of 15 fps; the same techniques were also verified at different bit rates. The Foreman, Hall, Football, Claire, Flower, and Mother and Daughter sequences are used. Only I and P frames are used, with a GOV length of 60, a search range of 32, and inter prediction modes up to 8x8 blocks. Table.3 compares the PSNR quality and motion estimation time in seconds of the full search algorithm and the proposed techniques, with two reference frames used for motion compensation. The full search block-matching algorithm is used for motion estimation in the primary reference frame.

VI. CONCLUSIONS

A fast technique for predicting motion in multiple reference frames has been presented. The proposed algorithm scales the complexity of motion estimation in the multiple reference frame scenario down to close to that required for single reference frame motion estimation, with negligible loss of PSNR. The proposed scheme also minimizes memory traffic in hardware implementations, as the search area in motion estimation is minimized.
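As the compact illustration of the seed selection referenced in Section IV (Steps 2-4), the sketch below compares the three candidate seeds by rate-distortion cost before the ±1 refinement. The cost callback, the temporal predictor argument and all identifiers are illustrative assumptions, not taken from the authors' implementation or from JM.

```c
typedef struct { int x, y; } MV;

/* Rate-distortion cost of motion vector 'mv' for the current block in
 * secondary reference frame 'ref'; assumed to be supplied by the encoder. */
typedef unsigned (*rd_cost_fn)(MV mv, int ref, void *ctx);

/* Pick the seed for the +/-1 iterative search in reference frame 'ref':
 * the zero MV, the H.264 spatial predictor (SPMV) and the temporal
 * predictor of Eq. 1 are compared by RD cost (Step 3 of the algorithm). */
MV choose_seed(MV spmv, MV temporal_pred, int ref,
               rd_cost_fn cost, void *ctx)
{
    MV candidates[3] = { { 0, 0 }, spmv, temporal_pred };
    MV best = candidates[0];
    unsigned best_cost = cost(best, ref, ctx);

    for (int i = 1; i < 3; i++) {
        unsigned c = cost(candidates[i], ref, ctx);
        if (c < best_cost) {
            best_cost = c;
            best = candidates[i];
        }
    }
    return best;   /* Steps 4-5 then refine this seed with the +/-1 search */
}
```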
REFERENCES

[1] Joint Video Team of ITU-T and ISO/IEC JTC 1, ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC, March 2003.
[2] M. Budagavi and J. Gibson, "Multi-frame Block Motion Compensated Video Coding for Wireless Channels," in Thirtieth Asilomar Conf. on Signals, Systems, and Computers, vol. 2, pp. 953-957, Nov. 1996.
[3] T. Wiegand, G. J. Sullivan, G. Bjontegaard and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, July 2003.

[4] Y.-H. Hsiao, T.-H. Lee and P.-C. Chang, "Short/Long-term Motion Vector Prediction in Multi-frame Video Coding System," in Proc. ICIP 2004, pp. 1449-1452, 2004.

[5] T. Wiegand, X. Zhang and B. Girod, "Long-term Memory Motion-Compensated Prediction," IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 1, pp. 70-84, Feb. 1999.

