Image Video Compression Method Based On Deep Learning

Posted on: 2021-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
GTID: 2518306050470904
Subject: Circuits and Systems
Abstract/Summary:
Over the past few decades, traditional image and video coding has greatly improved coding efficiency by fully exploiting the correlation within frames (intra-frame) and between frames (inter-frame). However, with the rapid development of mobile communications and artificial intelligence, the coding requirements for images and video are becoming increasingly diverse. Online live broadcasts, short videos, and similar scenarios place higher demands on visual quality. In smart cities and similar scenarios, massive numbers of images and videos need to be transmitted to the cloud for intelligent analysis, so traditional image and video coding standards face new challenges. It is therefore necessary to explore a new technical approach to image and video compression. This thesis studies image and video compression methods based on deep learning and proposes solutions for different needs, divided into image compression and video compression. The main contributions and innovations are as follows:

1. This thesis builds an image compression network based on an encoder-decoder architecture and implements end-to-end rate-distortion optimization. Specifically, the conditional probability of the features to be encoded, given a context prior, is constrained to a Gaussian distribution, so that the information entropy can be estimated and optimized jointly with the distortion metric. The mean and variance of the Gaussian distribution are predicted from side information that represents the context. The resulting method performs comparably to BPG on the Kodak dataset; at the same compression quality, the compressed file size is up to 40% smaller than with JPEG2000. To address the problem that existing deep learning image compression methods produce only one bitrate per model, this thesis also proposes a very simple but effective variable-bitrate method. A conditional autoencoder is constructed and a control factor is introduced. The factor adjusts the degree of spread of the probability distribution of the features to be encoded, which allows the bitrate of a single model to be adjusted dynamically.

2. For scenarios in which images must be uploaded from terminal devices to the cloud for visual task analysis, we propose a deep feature compression method that targets semantic fidelity. Specifically, we design a lightweight feature compression module that can be embedded in a general convolutional neural network; it extracts the semantic features most important to the task and compresses and encodes them. In the classification task, the compressed file size of our method is reduced by a factor of more than 19 compared with signal-fidelity compression methods represented by HEVC.

3. In video compression based on the predictive coding framework, inter-frame motion information needs to be estimated. To address the limited receptive field of optical flow, filter prediction, and other methods, this thesis proposes a video frame prediction method based on deformable convolution, which extracts motion features between adjacent frames and then predicts the offsets of the convolution kernel. This achieves a larger receptive field without increasing the amount of computation. On this basis, the thesis designs a simple video compression method that achieves performance equivalent to H.264/AVC on the HEVC standard test sequences in both the PSNR and MS-SSIM metrics.
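As a rough illustration of the rate-distortion objective described in the first contribution, the following Python sketch estimates the bit cost of quantized features under a per-symbol Gaussian model and combines it with a distortion term. It is not the thesis's implementation: the function names, the unit-width quantization bin, the use of mean squared error as the distortion, and the trade-off weight lam are all assumptions made for illustration. In the thesis, the means and variances are predicted by a network from side information; here they are simply passed in as lists.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def symbol_bits(y, mu, sigma, eps=1e-9):
    """Estimated bits for one integer-quantized symbol y.

    Assumption: the probability of y is the Gaussian mass on the
    unit-width bin [y - 0.5, y + 0.5]. eps guards against log2(0).
    """
    p = gaussian_cdf(y + 0.5, mu, sigma) - gaussian_cdf(y - 0.5, mu, sigma)
    return -math.log2(max(p, eps))


def estimated_rate(symbols, mus, sigmas):
    """Total estimated bits for a sequence of quantized symbols."""
    return sum(symbol_bits(y, m, s) for y, m, s in zip(symbols, mus, sigmas))


def mse(original, reconstructed):
    """Mean squared error, used here as a stand-in distortion metric."""
    n = len(original)
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n


def rd_loss(symbols, mus, sigmas, original, reconstructed, lam):
    """Rate-distortion objective R + lam * D (lam is a hypothetical weight)."""
    rate = estimated_rate(symbols, mus, sigmas)
    return rate + lam * mse(original, reconstructed)


if __name__ == "__main__":
    # Toy values only; these are not data from the thesis.
    symbols = [0, 1, -2]
    mus = [0.0, 0.8, -1.5]
    sigmas = [1.0, 0.5, 2.0]
    original = [0.10, 0.50, 0.90]
    reconstructed = [0.12, 0.47, 0.85]
    print(rd_loss(symbols, mus, sigmas, original, reconstructed, lam=100.0))
```

In this sketch, a symbol that lies close to its predicted mean costs fewer bits when the predicted standard deviation is small. This is the reason an accurate context-dependent prediction of the mean and variance lowers the estimated rate.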
Keywords/Search Tags: Deep Learning, Feature Compression, Image Compression, Rate-distortion Optimization, Video Compression, Motion Estimation