Research On Image And Video Compression Based On The Framework Of Variational Autoencoder

Posted on: 2022-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Wu
Full Text: PDF
GTID: 2518306323966509
Subject: Information and Communication Engineering
Abstract/Summary:
Compression is widely used to reduce the size of media content and thereby save network bandwidth and storage resources. Traditional compression technologies have improved performance by around 50% every ten years, at the cost of additional computational complexity. The latest coding standards already integrate many complex modules, so further improvement has become increasingly difficult. Thanks to the development of deep learning, compression technology based on variational autoencoders has advanced rapidly in recent years, and image compression schemes of this kind have caught up with the performance of the latest traditional codecs within a few years. As a new compression framework, variational-autoencoder-based compression still leaves considerable room for further optimization, so it is essential to explore this framework. This thesis focuses on variational-autoencoder-based compression schemes for image and video content, and considers how to improve the current framework from the perspectives of coding performance and coding speed. The main contributions and innovations of the thesis are as follows:

(1) This thesis proposes an image compression method based on 3D context and visual quality. Rather than directly using an autoregressive model to capture only the spatial correlation for entropy coding, the proposed scheme designs a 3D autoregressive model that captures the spatial correlation together with the correlation across the channels of the latent variable, thereby improving performance. In addition, unlike the usual schemes that use the mean square error as the distortion metric for optimization, the proposed scheme takes into account the gap between the distortion function and subjective quality, and constructs a weighted combination of multiple distortion metrics to guide model training.

(2) This thesis introduces a block-based acceleration method. Unlike most compression schemes, which use an entire frame as the coding unit, this method uses the block as the coding unit for image compression. Block-based compression gives the model good parallelism, which improves compression speed. To reduce the impact of blocking on performance, the scheme also introduces prediction and post-processing modules. Experiments show that the proposed scheme achieves a 4.1% performance improvement over VTM 8.0 and is about ten times faster than existing work.

(3) This thesis presents a video compression framework called Memorize-Then-Recall. Unlike traditional schemes, the framework exploits the structural information of the video from a semantic perspective, decomposing a set of video frames into global features that contain appearance information and skeleton features that contain motion information. During reconstruction, an attention mechanism fuses this information and is combined with a generative adversarial network to reconstruct the frames. Experimental results show that the framework achieves better performance than H.265 on human motion videos.
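The block-based coding unit described in contribution (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `split_into_blocks` and `merge_blocks`, the non-overlapping square tiling, and the edge-replication padding are all assumptions made for illustration, and the learned codec, prediction module, and post-processing module are not shown. The sketch only shows why the approach parallelizes well: once the image is tiled, each tile can be handed to the codec independently.

```python
def split_into_blocks(image, block):
    """Split a 2-D image (list of rows) into non-overlapping block x block tiles.

    Hypothetical helper: tiles that cross the image border are padded by
    repeating the last row or column (an assumption, not from the thesis).
    Returns the tiles in raster order plus the original (height, width).
    """
    h, w = len(image), len(image[0])
    rows = -(-h // block)  # ceiling division: number of tile rows
    cols = -(-w // block)  # ceiling division: number of tile columns
    tiles = []
    for by in range(rows):
        for bx in range(cols):
            tile = []
            for y in range(by * block, (by + 1) * block):
                src = image[min(y, h - 1)]  # clamp to replicate the last row
                tile.append([src[min(x, w - 1)]  # clamp to replicate the last column
                             for x in range(bx * block, (bx + 1) * block)])
            tiles.append(tile)
    return tiles, (h, w)


def merge_blocks(tiles, shape, block):
    """Reassemble raster-ordered tiles into an image, cropping the padding."""
    h, w = shape
    cols = -(-w // block)
    out = [[None] * w for _ in range(h)]
    for i, tile in enumerate(tiles):
        by, bx = divmod(i, cols)
        for dy, row in enumerate(tile):
            y = by * block + dy
            if y >= h:
                break  # remaining rows of this tile are padding
            for dx, value in enumerate(row):
                x = bx * block + dx
                if x < w:  # skip padded columns
                    out[y][x] = value
    return out


# Usage: a 5 x 7 image tiled with 4 x 4 blocks yields a 2 x 2 grid of tiles.
image = [[y * 7 + x for x in range(7)] for y in range(5)]
tiles, shape = split_into_blocks(image, 4)
restored = merge_blocks(tiles, shape, 4)
```

Because the tiles do not depend on one another in this sketch, they could be encoded and decoded concurrently; the prediction and post-processing modules mentioned in the abstract would then address the quality loss at tile boundaries.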
Keywords/Search Tags: Image compression, Video compression, Variational Auto-Encoder, Autoregressive Model, Acceleration Algorithm, Attention Mechanism, Generative Adversarial Network