Research On Accurate Prediction Models For Video Coding

Posted on: 2021-05-20  Degree: Doctor  Type: Dissertation
Country: China  Candidate: C Y Ma
GTID: 1368330602994245  Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, rapidly developing video applications such as high-definition television, live video, and video surveillance demand video technology with higher compression efficiency. The current international video coding standard, High Efficiency Video Coding (HEVC), is based on a hybrid video coding framework, which mainly includes block-based intra/inter prediction, transform, quantization, entropy coding, and loop filtering. Each module in HEVC is designed manually according to signal processing theory, under the assumption that the video signal is stationary. However, actual video signals are often non-stationary, so the module designs are not accurate enough. As a result, they cannot efficiently remove the redundancy in video signals, which limits further improvement of video coding performance.

On the one hand, HEVC inter prediction adopts multiple reference frames that are near the current frame as reference content. The content of these reference frames is usually highly similar, which wastes the limited reference buffer space. In addition, in traffic surveillance video, similar vehicles appear repeatedly and the background remains stable for a long time, yet the multiple reference frame mechanism cannot effectively exploit this prior characteristic. On the other hand, the discrete cosine transform (DCT) or discrete sine transform (DST) adopted in HEVC intra coding is a non-optimal linear transform, so some correlations may remain among the coefficients in the current transform block. Moreover, because HEVC intra prediction is only simple angular prediction, correlations between the coefficients of the current transform block and those of neighboring transform blocks may also remain.

Aiming at the problems mentioned above, this dissertation proposes a reference content modeling-based inter prediction coding scheme and a deep learning modeling-based transform coefficient prediction coding scheme. The former contains an image patch-based inter prediction reference information model designed for generic video coding and a library-based foreground and background reference information model designed for traffic surveillance video. The latter contains a deep learning-based transform coefficient entropy coding scheme and a deep learning-based transform coefficient prediction scheme. The main innovations and contributions of this dissertation are listed as follows.

(1) Aiming at the possible waste of reference buffer space in the multiple reference frame mechanism in HEVC, this dissertation proposes an image patch-based inter prediction reference information model. Within the limited reference buffer space, part of the reference frame space is used to store image patches (sub-pictures taken from the reference frame), which have a smaller granularity than a reference frame. Besides balancing video signal noise, the reference buffer can then include more exposed parts of occluded objects, which builds a more accurate reference information model for inter prediction. To realize this model, we design the corresponding image patch generation, image patch management, and image patch utilization modules, and we implement the model in the HEVC coding framework. Experimental results show that the proposed image patch-based inter prediction reference information model can significantly improve video coding performance.

(2) This dissertation proposes a library-based foreground and background reference information model based on the observed long-term repetition of foreground and background in traffic surveillance video. In the early stage of a traffic surveillance video, we extract the backgrounds and the vehicles, remove the redundancy among them, and save the remaining vehicles and backgrounds in the library. In the later stage of coding, the vehicles and background segmented from each video frame are used to retrieve the matching vehicles and background from the library, which provides more accurate reference information for the frame to be encoded. We implement the library-based foreground and background reference information model in the HEVC coding framework. Experimental results show that the proposed model can significantly improve the coding performance of traffic surveillance video sequences.

(3) Aiming at the correlations that still exist among transform coefficients in HEVC, this dissertation proposes a deep learning modeling-based transform coefficient prediction coding scheme. As these correlations are difficult to describe with traditional methods, this dissertation uses deep neural networks to model the correlations among the coefficients in the current transform block, as well as the correlations between the coefficients of the current transform block and those of neighboring transform blocks. Specifically, by using a neural network to estimate the probability distribution of each intra prediction residual transform coefficient, this dissertation proposes a deep learning-based entropy coding scheme for intra prediction residual transform coefficients; by using a neural network to estimate the value of each intra prediction residual transform coefficient, it proposes a deep learning-based prediction scheme for those coefficients. We implement both the entropy coding scheme and the prediction scheme in the HEVC coding framework. Experimental results show that both schemes can clearly improve video coding performance.
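
The idea behind the entropy coding scheme in contribution (3) can be illustrated with a small sketch: the ideal cost of coding a symbol is -log2(p) bits, so a model that assigns higher probability to the values that actually occur yields fewer bits. The code below is only an illustration under stated assumptions, not the dissertation's method. The coefficient levels and both probability tables are hypothetical numbers, the function names are invented for this example, and a hand-written table stands in for the neural network's predicted distribution.

```python
import math

def ideal_code_length_bits(probabilities, symbol):
    """Ideal entropy-coding cost, in bits, of `symbol` under a probability model."""
    p = probabilities[symbol]
    if p <= 0.0:
        raise ValueError("the model must assign nonzero probability to the coded symbol")
    return -math.log2(p)

def total_bits(model, symbols):
    """Sum of ideal code lengths over a sequence of symbols."""
    return sum(ideal_code_length_bits(model, s) for s in symbols)

# Hypothetical quantized coefficient levels of one transform block.
levels = [0, 0, 1, 0, -1, 0, 0, 2]

# Hypothetical fixed model (assumption: stands in for a hand-designed context model).
fixed_model = {-1: 0.2, 0: 0.4, 1: 0.2, 2: 0.2}

# Hypothetical sharper model (assumption: stands in for a learned estimate
# that concentrates probability on the values that occur most often).
sharper_model = {-1: 0.1, 0: 0.7, 1: 0.1, 2: 0.1}

fixed_bits = total_bits(fixed_model, levels)
sharper_bits = total_bits(sharper_model, levels)
print(f"fixed model:   {fixed_bits:.2f} bits")
print(f"sharper model: {sharper_bits:.2f} bits")
```

In this toy case the sharper model costs fewer bits because most levels are zero. A practical codec would feed such probabilities to an arithmetic coder and would condition them on already-coded coefficients; neither is modeled here.
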
Keywords/Search Tags: High Efficiency Video Coding, multiple reference frames, traffic surveillance video coding, transform coefficient correlations, deep learning