
Research On Inter Prediction For Video Compression

Posted on: 2021-02-26  Degree: Doctor  Type: Dissertation
Country: China  Candidate: J Mao  Full Text: PDF
GTID: 1368330614967746  Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of information technology, video has become the dominant form of data on the Internet. The demand for wide-color-gamut, high-quality, high-resolution video is increasing, which places a heavy burden on the storage and transmission of video data. These application requirements call for higher video compression efficiency, and improving coding efficiency is an important and active research topic worldwide. This dissertation is devoted to improving the coding efficiency of inter prediction in video coding, and focuses mainly on two aspects: improving the prediction accuracy of motion vectors and improving the prediction accuracy of prediction blocks. The main work and contributions are as follows.

1. Motion vector (MV) prediction based on virtual motion vectors is proposed to solve the problem of insufficient motion vector predictor candidates under long-term reference. Because a long-term MV (LMV) has low correlation with a short-term MV (SMV), video coding technology disables cross-prediction between LMVs and SMVs, which can leave too few candidates from neighboring blocks. To enrich the MV candidates, the reconstructed pixels are used to derive an MV of the absent type by motion estimation; the derived MV is called a virtual MV. As a result, every coding block has both an LMV and an SMV after it is reconstructed, and for subsequent blocks an MV of the same type from any neighboring block can be used as the MV predictor. Because a real MV is more reliable than a virtual MV, a reliability-based motion vector prediction method is proposed to construct the motion vector candidate list, achieving a 1% bit-rate saving.

2. We propose an adaptive weighted bi-prediction based on the similarities of spatial neighboring pixels to improve the accuracy of bi-prediction blocks. In the bi-prediction of the merge/skip mode, we found that one reference block is more similar to the current block than the other, and that the optimal weighting factors are uniformly distributed among all candidates, which implies that no single weighting factor suits all blocks. Therefore, an adaptive weighted bi-prediction is proposed. Theoretical analysis shows that there is a logarithmic relationship between the ratio of the two similarities and the optimal weighting factor. Because the current block is unknown, the similarity between the spatial neighboring pixels of the current block and those of the two reference blocks is used to estimate the similarity between the current block and the two reference blocks. By assigning a larger weight to the more similar reference block, coding performance is improved by 0.5%.

3. We propose a bi-prediction based on convolutional neural networks (CNN) using spatial information. In the preceding adaptive weighted bi-prediction, all pixels inside one block share the same weighting factors. When there are occlusions and illumination changes between video frames, block-level adaptive weighted bi-prediction generates structure-related prediction residuals. We propose CNN-based bi-prediction that performs pixel-wise fusion by utilizing patch-level information. Moreover, our previous work shows that spatial information is beneficial for improving bi-prediction efficiency. Therefore, the spatial neighboring pixels and the reference block information are used as network inputs, and the network outputs the final prediction block. The spatial neighboring pixels offer the following advantages: 1) the spatial neighboring pixels of the current block help to improve the prediction accuracy around the block boundary; 2) they make it possible to estimate the similarities between the current block and the two reference blocks, so that prediction accuracy can be improved by assigning a larger weight to the more similar pixels; 3) they make it possible to estimate the temporal variation between the current block and a reference block in order to refine the reference block, which further improves prediction accuracy. Compression performance is improved by 3%.

4. We propose a CNN-based bi-prediction using spatio-temporal information (STCNN). In common coding structures, the two reference blocks are located either in opposite directions or in the same direction relative to the current block. With temporal information, CNN-based bi-prediction can handle extrapolation and interpolation uniformly. Moreover, because frames that are closer in time are more highly correlated, temporal information helps to improve prediction efficiency. Furthermore, we explore the necessity of applying STCNN-based bi-prediction through rate-distortion optimization in different inter modes, achieving a 5% bit-rate saving. We also combine bi-directional optical flow with STCNN-based bi-prediction to improve compression efficiency.

In summary, on the one hand, the proposed virtual motion vector provides a more accurate motion vector predictor under long-term reference, which improves the coding efficiency of motion vectors. On the other hand, this dissertation proposes block-level and pixel-level adaptive weighted bi-prediction to enhance the accuracy of the predicted pixels.
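The block-level adaptive weighted bi-prediction of the second contribution can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract states only that the optimal weight is logarithmically related to the ratio of the two similarities, so the exact mapping `w0 = 0.5 + k * ln(d1 / d0)`, the slope `k`, the smoothing term `eps`, the use of the sum of absolute differences (SAD) as the dissimilarity measure, and all function names below are assumptions made for illustration.

```python
import math


def sad(a, b):
    """Sum of absolute differences between two equal-length pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def estimate_weight(cur_template, ref0_template, ref1_template, k=0.25, eps=1.0):
    """Estimate the weight of reference block 0 from neighboring pixels.

    The templates are the reconstructed pixels adjacent to the current block
    and to each reference block. The mapping is a hypothetical one:
        w0 = 0.5 + k * ln(d1 / d0), clamped to [0, 1],
    where d0 and d1 are the template SADs (a smaller SAD means more similar).
    k and eps are illustrative values, not taken from the source.
    """
    d0 = sad(cur_template, ref0_template) + eps  # eps avoids division by zero
    d1 = sad(cur_template, ref1_template) + eps
    w0 = 0.5 + k * math.log(d1 / d0)
    return min(1.0, max(0.0, w0))


def weighted_bi_prediction(ref0_block, ref1_block, w0):
    """Blend two reference blocks with one block-level weight."""
    return [round(w0 * p0 + (1.0 - w0) * p1)
            for p0, p1 in zip(ref0_block, ref1_block)]


# Usage: the template of reference 0 matches the current template more closely,
# so reference 0 receives the larger weight.
cur_t = [10, 20, 30]
ref0_t = [11, 21, 31]
ref1_t = [12, 22, 32]
w0 = estimate_weight(cur_t, ref0_t, ref1_t)
pred = weighted_bi_prediction([100, 104], [120, 96], w0)
```

When both templates are equally similar to the current one, the sketch falls back to the conventional average (`w0 = 0.5`). Because only reconstructed neighboring pixels are used, a decoder could derive the same weight without extra signaling, which is consistent with the motivation given in the abstract.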
Keywords/Search Tags: Video coding, motion vector prediction, intra-inter prediction, bi-prediction, deep learning