Research On Image Classification Algorithm Based On ViT

Posted on: 2024-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: M H Huang
Full Text: PDF
GTID: 2568307118951109
Subject: Electronic information
Abstract/Summary:
With the development of deep learning, image classification based on convolutional neural networks has become mature. The Transformer network, introduced from natural language processing, provides a new technical route for computer vision. The Vision Transformer (ViT) model was designed to retain the features of the original Transformer as far as possible, and as a result its structure has certain defects when processing image information. For example, its simple, coarse splitting of the image into blocks loses part of the image information, which hinders the learning of image features. This thesis studies deep learning techniques, analyzes the ViT network and convolutional neural networks, and studies image classification based on these networks. The main work is as follows:

1. Different network models are selected and compared experimentally to analyze how transfer learning affects training. A series of comparative experiments is designed using convolutional neural networks and Transformer-class networks, trained on an ImageNet-subset flower classification dataset and on the Food-101 dataset. Weights pre-trained on ImageNet are used for each network. The classification performance of the different networks on the different tasks is analyzed and compared with the same networks trained without pre-trained weights. The experiments show that transfer learning greatly improves the training results.

2. Using data augmentation, the change in network training performance under noise interference and with an extended dataset is studied. Random noise and perturbations are introduced into the original flower classification dataset to augment it, and convolutional neural networks and Transformer-class networks are trained and analyzed on the new dataset. The experiments show that the Transformer network is more robust to local noise perturbations and that self-attention has an advantage over convolution as a computation method, while ViT's image-block operation is not conducive to extracting features from complex images.

3. An improved network model, HFE-ViT, is designed, and an image classification algorithm based on it is given. Training and simulation experiments are carried out on the flower Aug dataset and the Food-101 dataset. In this model, a hierarchical feature extraction structure replaces ViT's image-block splitting operation. The simulation experiments show that the hierarchical feature extraction structure improves the training results of ViT, with higher accuracy on both the training and testing datasets.
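To make the "image-block operation" discussed above concrete, the following is a minimal sketch of how a standard ViT splits an image into non-overlapping blocks (patches) and flattens each one into a token. It is not taken from the thesis: the function name `patchify`, the nested-list image layout (height x width x channels), and the 4x4 toy image are assumptions chosen for illustration, and practical implementations would typically use array reshapes or a strided convolution instead.

```python
def patchify(image, patch_size):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches.

    Hypothetical helper for illustration only; not the thesis implementation.
    Patches are returned in row-major order (top to bottom, left to right).
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch: rows, then pixels in the row, then channels.
            patch = [
                channel
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for channel in pixel
            ]
            patches.append(patch)
    return patches


# Toy 4x4 image with 3 channels; each pixel stores [row, column, 0].
image = [[[r, c, 0] for c in range(4)] for r in range(4)]
patches = patchify(image, patch_size=2)

print(len(patches))     # number of patches: (4 // 2) * (4 // 2)
print(len(patches[0]))  # values per patch: 2 * 2 * 3
print(patches[0])       # top-left patch, flattened
```

Each patch is handled as an independent token, so spatial structure inside a patch is reduced to a flat vector and structure that crosses patch borders is cut apart. This is the property the abstract describes as losing part of the image information, and it is what the hierarchical feature extraction structure is said to replace.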
Keywords/Search Tags: Image recognition, Image classification, Vision Transformer, Transfer Learning, Data Augmentation