
Research On Image Coding Based On Octave Convolution

Posted on: 2023-06-28  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Liu  Full Text: PDF
GTID: 2568306614493884  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, the volume of multimedia data on the Internet keeps growing. As one of the most common forms of multimedia data, digital images play a dominant role in information exchange. However, uncompressed images not only take up a lot of storage space but also put heavy bandwidth pressure on transmission channels, so they need to be compressed effectively. Image coding based on traditional transforms often suffers from defects such as artifacts and blurring, and the resulting image quality cannot meet current needs. In recent years, deep learning has achieved remarkable success in computer vision. Image coding methods based on deep learning have shown better performance than traditional codecs and are an important direction for future research. To reduce feature redundancy and improve the quality of reconstructed images, this thesis uses convolutional neural networks to compress images. The main research contents are as follows:

(1) An image coding method based on octave convolution and semantic segmentation is proposed. The codec uses octave convolution in place of vanilla convolution. Octave convolution decomposes the feature representation of the image into high-frequency and low-frequency components. Because adjacent elements in the neural network share information, the spatial resolution of the low-frequency components can be reduced, which reduces the spatial redundancy of the compressed representation. Furthermore, most generative models are based on pixel-by-pixel comparison and do not make full use of semantic information. This method uses the semantic segmentation map of the original image as auxiliary information to guide the spatial allocation of bits, which further improves the accuracy of semantics and texture in the reconstructed image. For training, generative adversarial networks (GANs) are used to optimize the generative model. Experimental results show that the proposed method significantly reduces the distortion of reconstructed images, outperforms existing standard image codecs at different bitrates, and has clear advantages in reconstructing images with complex textures and semantics.

(2) An image coding method based on octave convolution and multi-scale autoencoders is proposed. Multi-scale autoencoders encode the image at three scales separately, extracting features at different resolutions. These feature representations complement one another in restoring image information, which helps the coding framework compress and reconstruct images from coarse to fine. The codecs are composed of octave convolutional layers to reduce the spatial redundancy of the feature representations. The bitstream generated by the multi-scale autoencoders is encoded with a probability distribution-based entropy encoder, and the bitrate is estimated for rate-distortion optimization. Furthermore, the model is trained with an adversarial loss to further improve the visual quality of the reconstructed images. Experimental results show that the proposed method performs better than traditional standard codecs and some learning-based methods. In particular, the quality of reconstructed images is effectively improved at low bitrates.
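
The high-frequency/low-frequency decomposition described in (1) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: it assumes 1x1 (pointwise) kernels, 2x2 average pooling for downsampling and nearest-neighbour upsampling, and all function names, weight names and shapes are hypothetical choices made for illustration. The low-frequency branch is stored at half the spatial resolution, which is where the reduction in spatial redundancy comes from.

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling on a (C, H, W) array; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    """Nearest-neighbour 2x upsampling on a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, w):
    """Pointwise convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def octave_conv(x_high, x_low, w_hh, w_hl, w_lh, w_ll):
    """One octave convolution step with four information paths.

    x_high: high-frequency features at full resolution (C_h, H, W)
    x_low:  low-frequency features at half resolution (C_l, H/2, W/2)
    w_hl maps high -> low, w_lh maps low -> high (illustrative names).
    """
    # High-frequency output: high->high plus upsampled low->high.
    y_high = conv1x1(x_high, w_hh) + upsample2(conv1x1(x_low, w_lh))
    # Low-frequency output: low->low plus pooled high->low.
    y_low = conv1x1(x_low, w_ll) + conv1x1(avg_pool2(x_high), w_hl)
    return y_high, y_low

# Hypothetical sizes: 4 high-frequency and 2 low-frequency channels.
rng = np.random.default_rng(0)
x_high = rng.standard_normal((4, 8, 8))
x_low = rng.standard_normal((2, 4, 4))
y_high, y_low = octave_conv(
    x_high, x_low,
    w_hh=rng.standard_normal((4, 4)),
    w_hl=rng.standard_normal((2, 4)),
    w_lh=rng.standard_normal((4, 2)),
    w_ll=rng.standard_normal((2, 2)),
)
print(y_high.shape, y_low.shape)  # (4, 8, 8) (2, 4, 4)
```

A practical codec would presumably use larger learned kernels and strided or transposed variants inside the encoder and decoder; the sketch only shows how the two frequency branches exchange information while the low-frequency branch stays at reduced resolution.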
Keywords/Search Tags:Image coding, Octave convolution, Semantic segmentation, Generative adversarial networks, Multi-scale autoencoders