Image Compression Method Based On Multi-Scale Generative Adversarial Network

Posted on: 2022-07-02  Degree: Master  Type: Thesis
Country: China  Candidate: H Zhang  Full Text: PDF
GTID: 2518306602494574  Subject: Master of Engineering
Abstract/Summary:
Image compression is an important technology in computer vision. By reducing redundant information in an image, it eases transmission load and saves storage resources. With the data growth of the information age, the tension between massive volumes of image data and limited transmission resources has become increasingly prominent, so efficient image compression is particularly important. Traditional image compression algorithms rely on manually designed encoder/decoder frameworks that use fixed transforms to reduce redundancy, but at very low bit rates they produce blur, artifacts, and other distortions that degrade image quality. The generative adversarial network (GAN) is one of the most promising deep learning approaches of recent years; it trains a generator and a discriminator in competition with each other and produces good results. This thesis therefore adopts a deep learning approach, using a GAN as the basic framework for the image compression encoder/decoder. It implements image compression at different levels, preserves as much detail as possible in the important regions of the image, improves the visual quality of the reconstructed image, and also proposes a semantic inference method suited to image compression. The main contents and contributions of this thesis are as follows:

(1) At low bit rates, existing methods extract the typical features of high-resolution images poorly, and insufficient bit allocation across image regions causes distortion, so reconstruction quality is poor. To address this, the thesis proposes a GAN-based image compression method built on multi-scale processing and an attention mechanism. Multi-scale feature decomposition is applied to the design of both the encoder and the discriminator. When the encoder aggregates multi-scale feature information, the scales are combined by a weighted sum with different weights, which improves feature extraction. The CBAM attention mechanism is added so that the network attends to features in key regions and captures the texture information of the image. The generator and the multi-scale discriminator are trained with an adversarial loss, and the image is reconstructed step by step from low to high resolution, which enables compression and reconstruction of high-resolution images at extremely low bit rates and yields high-quality images consistent with subjective visual perception. The network parameters are updated end to end with the Adam optimizer, which effectively avoids vanishing gradients during training and contributes to image reconstruction and restoration. Experimental results show that the method reconstructs images well: at low bit rates the reconstructions are more appealing to viewers and the image content is sharper. Its PSNR and MS-SSIM are clearly better than those of JPEG2000 and some CNN-based methods.

(2) Existing image compression algorithms are poorly suited to semantic tasks. To address this, the thesis proposes a multi-task semantic compression method based on a multi-scale GAN. The image is compressed and reconstructed by a GAN trained for perceptual quality, and the latent representation produced by the quantizer is fed in parallel to the generator and to a classifier network. On one side, the generator decodes and reconstructs the image, with the multi-scale discriminator used for adversarial training to produce high-quality output. On the other side, the classifier performs semantic classification of the image. Multi-task learning is applied across the whole method: the network is optimized with shared encoder parameters, so that useful information is shared between semantic classification and image compression and both tasks benefit. The thesis combines mean squared error loss with perceptual loss to preserve fidelity at both the pixel level and the semantic level. End-to-end learning coordinates the compression and classification tasks, so the method uses fewer parameters, reconstructs high-quality images, and classifies images by their semantic content. Experimental results show that the proposed method produces better reconstructed images. On the ImageNet dataset, PSNR and SSIM reach 26.49 dB and 89.4% at 0.1450 bpp, and the image classification accuracy reaches 74.5%, which demonstrates the effectiveness of the proposed algorithm.
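For readers less familiar with the two quantities the abstract reports, PSNR and bits per pixel (bpp), the following minimal Python sketch shows their standard definitions. It is not code from the thesis: the function names are illustrative, and it assumes 8-bit pixel values (peak value 255) passed as flat sequences.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes both inputs hold the same number of pixel values and that
    max_value is the peak pixel value (255 for 8-bit images).
    """
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def bits_per_pixel(compressed_bytes, width, height):
    """Bit rate of a compressed image: total bits divided by pixel count."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return 8.0 * compressed_bytes / (width * height)


if __name__ == "__main__":
    # Hypothetical example: a 256x256 image stored in 1024 bytes.
    print(bits_per_pixel(1024, 256, 256))
```

Under these definitions, and taking a hypothetical 256x256 image purely for illustration, a rate of 0.1450 bpp corresponds to roughly 1,188 bytes for the whole image.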
Keywords/Search Tags: Deep Learning, Generative Adversarial Network, Image Compression, Image Classification, Multi-Scale Method