Research On Deep Learning-Based Fractional Motion Compensation

Posted on: 2021-03-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Yan
GTID: 1368330602994252
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of communication, big data, and multimedia technology, multimedia applications play an increasingly important role in daily life. The ubiquity of mobile devices and increasing video definition have caused the amount of video on the internet to grow tremendously. In recent years, artificial intelligence techniques such as deep learning have achieved great progress in image processing, computer vision, and natural language processing. Deep neural networks have a strong capacity for non-linear representation and can be optimized in an end-to-end manner. How to combine deep learning with video coding to further improve coding efficiency is therefore a valuable research direction.

In video coding, motion compensation-based inter prediction is used to reduce the temporal redundancy in video, thereby reducing the bit rate of the coding block. Because of digital sampling, actual object motion usually does not align with the sampling grid, so it is difficult to find an accurately matched block. To tackle this problem, fractional motion compensation was introduced into video coding: an interpolation filter interpolates the fractional picture from the integer picture, and the derived fractional picture is used for motion compensation. Traditional fractional motion compensation usually uses simple finite impulse response (FIR) filters. Such fixed linear filters have low implementation complexity but cannot effectively handle the prevalent non-linear and non-stationary properties of video signals. Inspired by the success of deep learning in image processing, this dissertation investigates how to use deep learning to improve the coding efficiency of fractional motion compensation. The main innovations and contributions of this dissertation are as follows.

1. This dissertation proposes the first CNN-based fractional interpolation technique. Supervised training of a convolutional neural network (CNN) requires the input and target of the network to be determined in advance, i.e., the integer picture and the fractional picture in this dissertation. However, fractional samples are unavailable after digital sampling, so the training data are not directly available. To address this, the dissertation first analyzes the formation principle of optical images and proposes a fractional sample generation algorithm based on Gaussian low-pass filtering and polyphase decimation. In addition, video coding usually uses lossy coding, so the reference pictures contain compression noise. To deal with this noise, the dissertation proposes a quantization parameter-based training data generation method. Furthermore, the dissertation proposes to train more efficient interpolation filters with CNNs. This dissertation demonstrates the effectiveness of deep learning-based fractional interpolation.

2. This dissertation proposes an inter-picture regression-based fractional motion compensation technique. The purpose of fractional motion compensation is to improve the accuracy of inter prediction; fractional motion compensation is therefore formulated as an inter-picture regression problem, namely predicting the pixel values of the current to-be-coded picture from the integer-pixel values of a reference picture. The dissertation further proposes to solve the regression problem by CNN training. High Efficiency Video Coding (HEVC) adopts bi-directional prediction, which uses the average of two reference blocks as the final prediction. The dissertation proposes a generalized fractional interpolation model that regards fractional interpolation in bi-prediction as a dual mapping problem, that is, mapping the integer reference blocks to the current block. To solve fractional interpolation in bi-prediction, the dissertation designs an iterative algorithm that simplifies the dual mapping problem into two unitary problems. It then investigates how to train the CNN model using encoded video sequences. Moreover, it proposes to integrate the trained CNN models into the HEVC scheme and performs a comprehensive set of experiments to evaluate the effectiveness of the proposed method.

3. This dissertation proposes an invertibility-driven interpolation filter method. It first analyzes the spatial duality of integer and fractional pixels and reveals the invertibility inherent in the fractional interpolation problem: an ideal interpolation filter can not only perfectly interpolate the fractional samples from the integer samples but also perfectly interpolate the integer samples from the fractional samples. A theoretical interpretation is then provided from the perspective of signal processing. Based on this invertibility, the dissertation proposes an unsupervised training algorithm and designs an end-to-end training scheme. Two loss functions are proposed: an invertible reconstruction loss and a fractional motion regularization loss. The proposed training scheme does not need hand-crafted “ground truth” fractional samples and thus overcomes the drawback of the previous learning-based interpolation filter methods.
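
As an illustration of the fractional sample generation idea named in the first contribution (Gaussian low-pass filtering followed by polyphase decimation), the following is a minimal one-dimensional sketch, not the dissertation's actual algorithm. The abstract gives no parameters, so everything here is an assumption made for illustration: the function names, the values of `sigma` and `radius`, the border replication, and the restriction to 1-D signals (a real pipeline would work on 2-D pictures). The idea is that a high-resolution signal is blurred and then split into phases, so that phase 0 plays the role of the integer samples (network input) and the remaining phases play the role of the fractional samples (network targets).

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2*radius + 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def low_pass(signal, kernel):
    """Filter `signal` with `kernel`, replicating the border samples."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def polyphase_decimate(signal, factor):
    """Split `signal` into `factor` phases; phase p holds samples p, p+factor, ..."""
    return [signal[p::factor] for p in range(factor)]

def make_training_pair(high_res, factor=2, sigma=1.0, radius=3):
    """Blur a high-resolution signal, then return (integer phase, fractional phases).

    sigma and radius are illustrative values, not taken from the dissertation.
    """
    blurred = low_pass(high_res, gaussian_kernel(sigma, radius))
    phases = polyphase_decimate(blurred, factor)
    return phases[0], phases[1:]

# Usage: factor=2 yields one fractional (half-sample) phase per integer phase.
high_res = [float(i % 7) for i in range(16)]
integer_samples, fractional_phases = make_training_pair(high_res, factor=2)
print(len(integer_samples), len(fractional_phases[0]))
```

With `factor=4` the same call would return three fractional phases, corresponding to quarter-sample positions under this simplified model.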
Keywords/Search Tags: Deep Learning, Convolutional Neural Network, Inter Prediction, Fractional Interpolation, Motion Compensation