
Research On The Algorithm For Fine-Grained Image Classification

Posted on: 2019-08-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Zhang | Full Text: PDF
GTID: 2428330566998574 | Subject: Computer Science and Technology
Abstract/Summary:
Fine-grained image classification is a sub-task of image classification that focuses on distinguishing sub-categories within the same category, such as bird species or car models. It is of great significance for large-scale image retrieval and classification management. Unlike general classification tasks, fine-grained image classification is characterized by small inter-class differences, so categories must be distinguished by subtle differences in local detail. Current fine-grained image classification algorithms therefore first generate candidate regions with a grouping-and-combination strategy, then filter them using bounding-box labels and spatial constraints, and finally classify them by training a convolutional neural network (CNN). These algorithms do not make good use of the target information, and the candidate regions still contain redundant information, which degrades classification performance.

To address this, we designed a fine-grained image classification algorithm based on an attention mechanism. The attention mechanism reflects how the human visual system perceives its surroundings unevenly: attention concentrates on the salient areas of the environment. In this paper, we used a convolutional neural network to simulate these characteristics of human visual attention so as to make good use of the information about the classification target, which helps improve the accuracy of fine-grained image classification.

The overall framework of our algorithm consists of two parts, the attention network and the classification network, and the key point is the design of the attention network. Because the candidate regions contain redundant information, this paper introduced and improved the attention network, which locates the object-level region and the part-level regions of the object from the feature maps produced by the convolutional layers of the CNN. To improve localization, we added a bilinear operation after the last convolutional layer to enhance the network's feature representation, which provides a good foundation for the classification network. To make better use of local features, this paper designed a multi-scale classification network that performs classification at multiple levels. Based on the attention map extracted from the attention network, we segmented the image using the HSV color space model and obtained image slices of the salient regions at the object level and the part level. Bilinear CNN classification models were trained on these slices. For the object-level and part-level classification networks, this paper used a multi-model fusion strategy to fuse the two networks' softmax vectors and obtain the final classification result.

A large number of experiments were conducted to verify the validity of the classification framework. The proposed algorithm achieved 91.6% and 85.6% accuracy on Car-196 and CUB-200 respectively, improvements of 1.4% and 1.5% over the baseline model. We also verified our results on the Fish CLEF2017 online evaluation task, achieving an accuracy of 65%, an improvement of 6% over the baseline model. The experimental results show that the proposed algorithm clearly improves classification accuracy for high-resolution images in which the object occupies a small proportion of the image.
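
The final step described in the abstract, fusing the softmax vectors of the object-level and part-level networks, can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted average used here is an assumption, as are the function names (softmax, fuse_predictions, predict), the object_weight parameter, and the toy scores; none of them are taken from the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(object_probs, part_probs, object_weight=0.5):
    """Fuse two probability vectors by weighted averaging.

    Assumption: the thesis only says the two softmax vectors are fused;
    a weighted average is one common choice, not necessarily the author's.
    """
    if len(object_probs) != len(part_probs):
        raise ValueError("both vectors must cover the same classes")
    if not 0.0 <= object_weight <= 1.0:
        raise ValueError("object_weight must lie in [0, 1]")
    part_weight = 1.0 - object_weight
    return [
        object_weight * o + part_weight * p
        for o, p in zip(object_probs, part_probs)
    ]


def predict(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Toy scores for three classes (made up for illustration only).
object_probs = softmax([2.0, 1.0, 0.1])  # object-level network favors class 0
part_probs = softmax([0.5, 2.5, 0.3])    # part-level network favors class 1
fused = fuse_predictions(object_probs, part_probs)
print(predict(fused))
```

With equal weights, the more confident network dominates the fused decision. In practice the weight would have to be chosen on validation data.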
Keywords/Search Tags:fine-grained image classification, convolutional neural network, attention network, bilinear operation, multiscale model fusion