Research On Depth Visual Attention Method For Multi Class Target Fine-Grained Recognition

Posted on: 2022-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Wang
GTID: 2518306524460214
Subject: Electronic Science and Technology
Abstract/Summary:
The main task of fine-grained image recognition is to distinguish different subclasses within the same class, for example among birds, vehicles, or pests. The difficulty is that fine-grained image datasets exhibit small differences between classes and large differences within each class. Mainstream fine-grained recognition algorithms fall broadly into two categories: strongly supervised methods, which require manually labeled bounding boxes, and weakly supervised methods, which require only category labels. However, obtaining manual boxes takes considerable manpower and material resources, and the discriminative regions they mark are subjective. These two factors limit the practicality of strongly supervised methods. This thesis therefore focuses on weakly supervised learning, and its main research contents are as follows:

(1) A saliency map directly reflects the location of the discriminative regions in an image and can guide the adjustment of network structure and parameters. To improve localization of the discriminative regions, this thesis obtains the saliency map of the last convolutional layer using the class activation mapping (CAM) algorithm. Guided by the saliency map, the original input image is augmented to improve the network's discriminative feature extraction and its generalization. Inception V3 is used as the feature extractor and the WS-DAN algorithm as the classification network to verify the method on the CUB-200-2011 bird dataset. Experiments show that the method is feasible and effective.

(2) In fine-grained image datasets, most classes are not difficult to distinguish; what really harms classification performance is the presence of extremely similar target classes. This thesis therefore proposes the Top-K loss function, which focuses on extremely similar categories, has a large gradient, and can accelerate network convergence. The Top-K loss function is plug-and-play and can be applied to various classification networks. In addition, this thesis takes the weakly supervised attention learning of the WS-DAN algorithm as the backbone, replaces the cross-entropy loss function of WS-DAN with the Top-K loss, and carries out experiments on the CUB-200-2011 bird dataset, where accuracy improves by 0.8%.

(3) Combining the data augmentation in (1) with the loss function in (2), the algorithm achieves performance improvements on three public datasets: CUB-200-2011, FGVC-Aircraft, and Stanford Cars.
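The saliency-guided augmentation in (1) can be illustrated with a minimal NumPy sketch. It assumes the standard CAM formulation, in which the map for a class is the sum of the last convolutional layer's feature maps weighted by that class's weights in the linear layer that follows global average pooling. The function names, the 0.5 threshold, and the bounding-box crop rule are hypothetical choices made for illustration; the abstract does not specify how the thesis turns the saliency map into an augmented image.

```python
import numpy as np


def class_activation_map(features, fc_weights, class_idx):
    """Standard CAM for one class, min-max normalized to [0, 1].

    features:   (C, H, W) activations of the last convolutional layer
    fc_weights: (num_classes, C) weights of the linear layer applied
                after global average pooling
    """
    # Weighted sum over the channel axis gives an (H, W) map.
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def saliency_crop_box(cam, image_hw, threshold=0.5):
    """Hypothetical crop rule: the bounding box, in image pixels, of all
    CAM cells at or above `threshold`, returned as (top, left, bottom, right).
    Falls back to the full image when no cell passes the threshold."""
    rows, cols = np.where(cam >= threshold)
    if rows.size == 0:
        return 0, 0, image_hw[0], image_hw[1]
    scale_y = image_hw[0] / cam.shape[0]
    scale_x = image_hw[1] / cam.shape[1]
    top = int(rows.min() * scale_y)
    bottom = int((rows.max() + 1) * scale_y)
    left = int(cols.min() * scale_x)
    right = int((cols.max() + 1) * scale_x)
    return top, left, bottom, right


# Toy example: 2 channels on a 4x4 grid, 2 classes, an 8x8 input image.
features = np.zeros((2, 4, 4))
features[0, 1:3, 2:4] = 1.0   # channel 0 fires on a small region
features[1] = 0.3             # channel 1 is uniform everywhere
fc_weights = np.array([[1.0, 0.0],
                       [0.0, 1.0]])

cam = class_activation_map(features, fc_weights, class_idx=0)
box = saliency_crop_box(cam, image_hw=(8, 8))
```

Cropping the input to `box` and resizing it back to the network's input size would give one saliency-guided view of the image. In the toy example, class 1 depends only on the uniform channel, so its map carries no localization signal and the crop falls back to the whole image.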
Keywords/Search Tags:Convolutional Neural Network, Fine-grained Image Classification, Attention Mechanism, Saliency Map, Top-K Loss Function