Study Of Fine-grained Image Classification Algorithm

Posted on: 2022-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: W C Ning
GTID: 2518306554470894
Subject: Computer Science and Technology
Abstract/Summary:
Image classification is widely used in industrial production and daily life. However, many application scenarios require distinguishing objects with very similar appearance, which is difficult for traditional image classification techniques. Fine-grained image classification has therefore become an important research direction within image classification. It applies to a broad range of scenarios, such as species classification in nature reserves, product identification in unmanned supermarkets, and vehicle identification at traffic intersections. Because of the “small inter-class variations and large intra-class variations” problem, fine-grained image classification is a challenging task, and current methods cannot yet meet the needs of practical applications. Based on the current state of research on fine-grained image classification and on the related theories and methods in this field, this dissertation improves existing fine-grained image classification methods in two respects: attention mechanisms and data augmentation. These improvements enhance the model's ability to locate discriminative regions and extract discriminative features, and achieve good performance on fine-grained image classification tasks. The main research contents of this dissertation are as follows:

(1) A fine-grained image classification method based on attention mechanisms. Attention is an effective way to locate discriminative regions in images; in convolutional neural networks it includes channel attention and spatial attention. Channel attention often uses Global Average Pooling to extract global information, but Global Average Pooling discards part of the information in each channel. This dissertation designs a better global information representation by combining different frequency components of a channel, which yields a more reasonable allocation of channel weights. For spatial attention, this dissertation uses a self-attention mechanism with embedded position information to extract global spatial information, which strengthens the expressive power of the features produced by the spatial attention module. Using both channel attention and spatial attention, the method achieves a state-of-the-art accuracy of 94.7% on the Stanford Cars dataset.

(2) A fine-grained image classification method based on data augmentation. Because samples are difficult to collect and annotate, fine-grained image datasets often suffer from insufficient training data, so data augmentation is needed to supplement them. However, traditional data augmentation methods do not take into account which data the image classification model prefers, which limits how much the augmented samples can improve classification performance. In this dissertation, an attention mechanism is used to locate the discriminative regions, and the data is augmented based on those regions, so that the augmented samples make full use of the information in the discriminative regions and the model can fully learn their features. This method achieves state-of-the-art accuracies of 95.5% and 93.4% on the fine-grained image datasets Stanford Cars and FGVC Aircraft, respectively.

A series of ablation experiments is conducted to verify the effectiveness of the proposed methods.
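
The remark that Global Average Pooling loses part of the channel information can be illustrated with a small sketch. The abstract does not state how the frequency components are computed or combined, so the code below assumes a 2D discrete cosine transform (DCT) basis and a simple scheme in which channel groups are each projected onto one frequency. The names `dct_basis`, `frequency_descriptor`, and `channel_weights`, and the random weight matrices, are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def dct_basis(h, w, u, v):
    """Unnormalised 2D DCT-II basis of frequency (u, v) on an h x w grid."""
    ys = np.cos(np.pi * u * (np.arange(h) + 0.5) / h)
    xs = np.cos(np.pi * v * (np.arange(w) + 0.5) / w)
    return np.outer(ys, xs)

def frequency_descriptor(feat, freqs):
    """Summarise each channel of feat (C, H, W) by one DCT coefficient.

    Assumption: channels are split into len(freqs) groups and each group
    is projected onto a single frequency from freqs.
    """
    c, h, w = feat.shape
    groups = np.array_split(np.arange(c), len(freqs))
    out = np.empty(c)
    for idx, (u, v) in zip(groups, freqs):
        basis = dct_basis(h, w, u, v)
        out[idx] = (feat[idx] * basis).sum(axis=(1, 2))
    return out

def channel_weights(desc, w1, w2):
    """Squeeze-and-excitation style gating: linear, ReLU, linear, sigmoid."""
    hidden = np.maximum(desc @ w1, 0.0)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2)))

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 7, 7))        # toy feature map: C=8, H=W=7
gap = feat.mean(axis=(1, 2))                 # Global Average Pooling

# With only frequency (0, 0), the descriptor is GAP scaled by H * W.
lowest = frequency_descriptor(feat, [(0, 0)])

# Adding higher frequencies gives descriptors that GAP cannot express.
mixed = frequency_descriptor(feat, [(0, 0), (0, 1), (1, 0), (1, 1)])

weights = channel_weights(mixed,
                          rng.standard_normal((8, 4)),
                          rng.standard_normal((4, 8)))
reweighted = feat * weights[:, None, None]   # apply channel attention
```

In this sketch the lowest-frequency descriptor coincides with Global Average Pooling up to a constant factor, which is why a descriptor built from several frequencies can carry information that pooling alone drops. Which frequencies to use and how to assign them to channels are design choices the abstract leaves open.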
Keywords: Fine-grained image classification, Attention mechanism, Data augmentation, Discriminative feature