Research On Stable Image Classification Method Based On Deep Network

Posted on: 2023-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: H Shao
Full Text: PDF
GTID: 2558307073482964
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of deep learning, artificial intelligence has achieved strong performance in tasks such as image classification and object detection, and the technology is now widely used in everyday life. Machine learning methods usually assume that the test data and the training data are independent and identically distributed (I.I.D.). In real-world applications, however, this assumption is difficult to satisfy. How to keep a model's predictions stable on data sets with a shifted distribution has therefore become an open problem in both academia and industry. Image classification is one of the most important tasks in computer vision, so the study of stable image classification methods has significant academic and practical value. To this end, from the perspectives of feature disentanglement and stable representation learning, this thesis studies stable image classification methods based on frontier techniques such as contrastive learning, frequency domain learning, and variational autoencoders. The research focuses on two non-I.I.D. scenarios for stable image classification: non-I.I.D. image classification and compositional zero-shot learning.

Non-I.I.D. image classification simulates different degrees of distribution shift between the training set and the test set. For this problem, the thesis proposes a model based on contrastive learning in the frequency domain that learns the stable features of an image and thereby overcomes the distribution shift between training and test data. The model is trained in two steps: contrastive pre-training in the frequency domain, followed by fine-tuning for image classification. In the pre-training stage, the Discrete Cosine Transform (DCT) is first applied to the image to extract its frequency domain features, and the contrastive loss among anchor samples, positive samples, and negative samples is then minimized so that the model learns the stable features. Subsequently, the model parameters are fine-tuned with the cross-entropy classification loss to train the model's classification ability. Experimental results on the public dataset NICO show that the proposed model achieves state-of-the-art performance, and extensive experiments support the effectiveness of the method.

In the compositional zero-shot learning task, each image is described by a composition of an attribute and an object, and the <attribute, object> compositions in the test set never appear in the training set. Compared with non-I.I.D. image classification, the distribution shift between the training and test sets is more severe and more challenging. From the perspective of feature disentanglement, the thesis proposes to extract disentangled attribute and object features with a variational autoencoder, and then to compute classification scores between the disentangled image features and the label features extracted by a graph neural network to obtain the classification results. The proposed model achieves state-of-the-art results on two public datasets, MIT-States and UT-Zappos, demonstrating the effectiveness of the method.
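
The pre-training step described above for non-I.I.D. image classification (a DCT followed by a contrastive loss over anchor, positive, and negative samples) can be illustrated with a minimal sketch. The abstract does not give the exact loss, feature encoder, or sampling strategy, so everything below is an assumption made for illustration: the function names `dct2_1d`, `dct2_2d`, `cosine`, and `info_nce` are hypothetical, the loss is a common InfoNCE-style formulation rather than the thesis's own, and raw DCT coefficients stand in for the features a deep network would produce.

```python
import math


def dct2_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (plain-Python reference version)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct2_2d(img):
    """2-D DCT-II of a grayscale image: transform the rows, then the columns."""
    rows = [dct2_1d(r) for r in img]
    cols = [dct2_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed form, not the thesis's).

    The loss is small when the anchor is close to the positive sample and
    far from every negative sample.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def frequency_features(img):
    """Flatten the 2-D DCT coefficients into one feature vector.

    In a real model these coefficients would be fed to a network encoder;
    using them directly is a simplification for this sketch.
    """
    return [c for row in dct2_2d(img) for c in row]


if __name__ == "__main__":
    anchor_img = [[1.0, 2.0], [3.0, 4.0]]
    positive_img = [[1.1, 2.0], [3.0, 4.1]]   # slightly perturbed view
    negative_img = [[4.0, 1.0], [0.0, 2.0]]   # unrelated image
    loss = info_nce(
        frequency_features(anchor_img),
        frequency_features(positive_img),
        [frequency_features(negative_img)],
    )
    print(f"contrastive loss: {loss:.4f}")
```

In practice a library routine such as `scipy.fft.dctn` would replace the hand-written transform, and the loss would be computed on batches of encoder outputs before fine-tuning with a cross-entropy loss.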
Keywords/Search Tags: Non-I.I.D. Image Classification, Compositional Zero-Shot Learning, Frequency Domain Learning, Contrastive Learning, Variational Autoencoder, Graph Neural Networks