
Feature Representation And Image Compression Based On Unsupervised Deep Model

Posted on: 2022-02-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Zhao
GTID: 1488306311992819
Subject: Information and Communication Engineering
Abstract/Summary:
Deep learning has achieved unprecedented success in computer vision in recent years. Currently, most applications depend heavily on large amounts of labeled data, which greatly limits the applicability of deep neural networks. In contrast to mainstream neural network training methods, it is widely believed that the human brain learns mainly in an unsupervised manner, while supervised information reinforces learning through feedback. Although we do not fully understand how the human brain works, our aim is to explore a brain-like form of unsupervised learning that automatically extracts rich features from unlabeled visual content and improves the self-learning ability of intelligent vision machines. Such unsupervised features can improve the quality of learning when applied to various visual tasks, and they also help us better understand both the human brain and artificial intelligence. This thesis proposes an unsupervised representation learning method based on the sparse decomposition of object instances in video, and applies this model to the visual tasks of object discovery and image classification. This thesis also proposes a variable-length image compression algorithm based on unsupervised adversarial learning. The main contributions of this thesis are as follows:

1. An unsupervised representation learning method based on the sparse decomposition of object instances in video is proposed, named UnsupV. Currently, most unsupervised models take static images as training data. However, the morphological and spatial variations of an object over time in a video contain richer training information. In addition, we assume that most object instances can be sparsely represented in some feature space, so that instance-to-image reconstruction and multi-level unsupervised feature learning can be realized. The UnsupV model takes videos as its learning source. By using neural networks to sparsely decompose the object instances in a video, the model learns to distinguish different instances and extract instance features without any labels. The method is trained on video but can directly extract unsupervised features from individual images. For the validation experiments, this thesis considers a relatively simple scenario in which each image is roughly composed of a foreground and a background. Based on an encoder-decoder structure, the foreground, the background, and the segmentation mask are each sparsely represented, and the model is trained end-to-end by reconstructing the original image. Experimental results on the large-scale YouTube Objects dataset show that UnsupV can separate the foreground from the background and accurately detect objects without any supervision, which verifies its ability to extract high-level visual features.

2. The unsupervised model UnsupV enhances the visual tasks of object discovery and image classification. In UnsupV-based object discovery, the object segmentation is obtained quickly with a standard feed-forward pass, reducing computation time compared with traditional algorithms. Experimental results on the Object Discovery, MSRC, and iCoseg datasets show that UnsupV still obtains high-quality segmentations on images of unseen classes. This removes the dependence of traditional algorithms on collections of images containing objects of the same class and verifies the generalization ability of UnsupV. In UnsupV-based image classification, three methods are proposed to evaluate how much UnsupV improves classification: fusing the original image with its segmentation, classifying with the learned feature representation, and training with a limited number of labels. Experimental results on the CIFAR-10 dataset show that UnsupV helps to improve classification accuracy and reduce the dependence on labeled data.

3. A variable-length image compression algorithm based on unsupervised adversarial learning is proposed. Currently, neural-network-based compression algorithms mainly use a fixed input length and a fixed output length, resulting in insufficient compression for some low-information images and large distortion for some high-information images. To tackle this problem, this thesis proposes a compression algorithm that achieves variable-length coding by training a single network. First, the model combines an auto-encoder and a generative adversarial network for generative compression. Second, a noise interference mechanism is proposed so that the feature nodes are ordered from top to bottom according to their importance in feature expression. Based on this importance ordering, variable-length compression is achieved by discarding the bits of the less important feature nodes until the compression target is met. Experimental results on the UT Zappos50K and CelebA-HQ datasets show that the proposed algorithm not only achieves variable-length compression but also reconstructs high-quality images at extremely low bit rates, outperforming the traditional JPEG and JPEG2000 algorithms.
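
The discarding step in contribution 3 can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the encoder already outputs quantized feature nodes ordered from most to least important, and that every node costs the same number of bits. The function names `truncate_code` and `pad_code` are hypothetical, and filling the dropped nodes with zeros before decoding is an assumption, since the abstract does not say how the decoder treats them.

```python
def truncate_code(code, bits_per_node, bit_budget):
    """Keep the leading feature nodes that fit within bit_budget.

    Assumes `code` is ordered from most to least important node and
    that each node costs `bits_per_node` bits (both are assumptions).
    """
    if bits_per_node <= 0:
        raise ValueError("bits_per_node must be positive")
    # Number of whole nodes the budget can pay for, capped at the code length.
    keep = min(len(code), max(0, bit_budget // bits_per_node))
    return list(code[:keep])


def pad_code(truncated, full_length, fill=0):
    """Restore a fixed-length code for the decoder.

    Dropped nodes are replaced by `fill`; the choice of 0 is an
    assumption made for this illustration.
    """
    if len(truncated) > full_length:
        raise ValueError("truncated code is longer than full_length")
    return list(truncated) + [fill] * (full_length - len(truncated))


if __name__ == "__main__":
    # Six hypothetical 2-bit nodes, most important first.
    full_code = [3, 1, 2, 0, 1, 3]
    sent = truncate_code(full_code, bits_per_node=2, bit_budget=8)
    received = pad_code(sent, full_length=len(full_code))
    print(sent, received)
```

Because the same ordered code serves every budget, a single trained network can cover a range of bit rates, which is the property the abstract describes.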
Keywords/Search Tags:deep learning, computer vision, unsupervised learning, unsupervised feature representation, object discovery, image classification, variable-length compression, adversarial learning