
Research on Fine-Grained Object Classification Method Based on Attention Mechanism

Posted on: 2020-10-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
GTID: 2428330599455406
Subject: Engineering
Abstract/Summary:
Fine-grained image classification is a sub-task of image classification and an active topic in computer vision and pattern recognition. Its goal is to distinguish different sub-categories of the same species. Unlike traditional image classification, sub-categories of the same species show small inter-class differences and large intra-class differences, which makes the task more challenging. Demand for more precise species classification is growing, but ordinary people find it difficult to tell sub-categories apart, and relying on domain experts is both slow and costly. This has prompted extensive academic research into using computer vision to automatically classify and retrieve large collections of fine-grained images at low cost, which makes it a valuable research topic.

When classifying fine-grained images, the detailed features that distinguish sub-categories are often confined to tiny local regions, also called salient local regions. Accurately locating these salient regions is therefore both the focus and the main difficulty of fine-grained image classification research. Existing algorithms mainly rely on object detection or manual annotations to localize local regions. On the one hand, manual annotations are expensive to acquire, which limits their practicality. On the other hand, the localized regions contain considerable redundant information and do not make full use of the target information, and such algorithms ignore the correlation between image channels, which hurts classification performance.

To address these two problems, this thesis introduces an attention network at the last layer of the deep convolutional neural network. The attention mechanism simulates human visual attention: by learning from a large number of training samples, it can locate salient local regions on its own and make full use of the classification target's information. Without relying on manually annotated bounding boxes during training, the attention network learns a function that assigns different weights to the channel feature maps output by the last convolutional layer and recalibrates those feature maps according to the weights. The magnitude of each weight identifies the salient channels, so that target channel information is fully used and the interference of uninformative channels is suppressed.

However, with the attention network alone, the model's feature representation ability is still insufficient, resulting in poor classification results. Inspired by the bilinear CNN (B-CNN), bilinear pooling is applied to the recalibrated channel feature maps, which takes into account the interactions between local features of different channels. This improves the representational power of the features, which in turn improves the localization of salient channels, uses target channel information more accurately, and improves the classification ability of the network.

The effectiveness of the proposed algorithm is verified through multiple sets of comparison experiments on several public fine-grained image datasets. Compared with the baseline network, the accuracy of the proposed algorithm on CUB-200 increases by 1.26%, reaching 85.26%. Compared with the baseline network reproduced in the PyTorch framework with the same number of training rounds, the classification accuracy on CUB-200, FGVC-Aircraft and Car-196 increases by 0.26%, 0.1%, and 0.46%, respectively.
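The two steps described in the abstract, channel recalibration followed by bilinear pooling, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the form of the attention network, so the sketch assumes a squeeze-and-excitation style gate (global average pooling, a small two-layer network, a sigmoid) and the signed square root plus L2 normalization commonly paired with B-CNN. The names `channel_attention`, `bilinear_pool`, `w1` and `w2` are hypothetical, the weights are random rather than trained, and plain Python lists stand in for the tensors a framework such as PyTorch would provide.

```python
import math
import random


def channel_attention(features, w1, w2):
    """Reweight channels with a learned gate (assumed squeeze-and-excitation style).

    features: list of C channels, each a flat list of H*W activations.
    w1: HIDDEN x C weights, w2: C x HIDDEN weights (hypothetical parameters).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excite: a small two-layer network, ReLU then sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Recalibrate: scale every activation of a channel by that channel's gate.
    recalibrated = [[g * v for v in ch] for g, ch in zip(gates, features)]
    return recalibrated, gates


def bilinear_pool(features):
    """Pool channel-pair interactions into one normalized descriptor of length C*C."""
    n = len(features[0])
    # Spatial average of the outer product of the channel vector with itself.
    gram = [sum(a * b for a, b in zip(ci, cj)) / n
            for ci in features for cj in features]
    # Signed square root, then L2 normalization.
    signed = [math.copysign(math.sqrt(abs(g)), g) for g in gram]
    norm = math.sqrt(sum(v * v for v in signed)) or 1.0
    return [v / norm for v in signed]


# Toy input: C = 4 channels over a 3 x 3 feature map, random (untrained) weights.
rng = random.Random(0)
C, HIDDEN, N = 4, 2, 9
features = [[rng.uniform(0.0, 1.0) for _ in range(N)] for _ in range(C)]
w1 = [[rng.uniform(-1.0, 1.0) for _ in range(C)] for _ in range(HIDDEN)]
w2 = [[rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)] for _ in range(C)]

recalibrated, gates = channel_attention(features, w1, w2)
descriptor = bilinear_pool(recalibrated)
print(len(gates), len(descriptor))
```

Each gate lies strictly between 0 and 1, so a channel is attenuated rather than removed. The descriptor has C*C entries (16 in this toy case) because it records one interaction term per pair of channels, and in a full model it would feed the final classifier.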
Keywords/Search Tags: fine-grained image classification, convolutional neural network, bilinear CNN, attention mechanism