Fine-grained Image Classification Algorithm Based On Attention Mechanism And Semantic Data Enhancement

Posted on: 2022-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: R Tan
Full Text: PDF
GTID: 2518306539468734
Subject: Information and Communication Engineering
Abstract/Summary:
Fine-grained image classification is a sub-task of image classification, one of the basic computer vision tasks. Compared with traditional coarse-grained image classification, it has broader business demand and more practical application scenarios in industry and everyday life. Because fine-grained images have complex intra-class variation and similar inter-class appearance, a deep convolutional neural network (DCNN) alone has difficulty focusing on the key local parts of the image, and its classification accuracy is unsatisfactory. In addition, existing fine-grained image datasets contain relatively little training data, which also limits classification performance. To address these two issues, this thesis approaches the problem from two perspectives: attention mechanisms and data enhancement (augmentation). The main contributions are as follows.

(1) To address the insufficient feature extraction capability of DCNNs for fine-grained image classification, this thesis introduces three attention modules into the DCNN: Squeeze-and-Excitation (SE block), Convolutional Block Attention Module (CBAM), and Efficient Channel Attention (ECA), so that the DCNN can attend to local features of the image. Because the way an attention module is embedded strongly affects model performance, this thesis proposes three embedding methods: serial, residual, and parallel. Comparative experiments and analysis of visualization results show that embedding CBAM in parallel lets the DCNN attend to richer local information and yields the largest improvement. Compared with the VGG16 and ResNet34 baselines, accuracy on the CUB-200-2011 dataset increases by 1.98% and 1.57%, respectively. The resulting embedding scheme can be easily applied to any DCNN and is therefore highly general.

(2) To address the shortage of fine-grained training data and the fact that traditional data enhancement techniques are poorly suited to fine-grained image classification, this thesis adopts dual-semantic data enhancement. In the training phase, a local attention learning module and a feature difference learning module are built on bilinear attention pooling (BAP) to obtain data at two different semantic levels. This dual-semantic data focuses on the local details of fine-grained images and on the important contours of the target, which greatly increases the effective training data and improves model accuracy through semantic data enhancement. To strengthen mid-level feature representation, the model builds on the conclusions of Chapter 3 and embeds CBAM in parallel in both the local attention learning module and the feature difference learning module. Finally, a target localization module is built for the testing phase so that the model focuses on the whole classification target, which further improves accuracy. Comparative experiments and visual analysis show that the semantic data obtained improves classification accuracy: the model achieves 89.5%, 93.6%, and 94.7% on CUB-200-2011, FGVC-Aircraft, and Stanford Cars, respectively, outperforming the other methods compared.

The work in this thesis not only shows strong classification performance on these three widely used datasets, but also provides ideas and a theoretical reference for future research on fine-grained image classification algorithms.
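The bilinear attention pooling (BAP) step that part (2) of the abstract builds on can be sketched roughly as follows. This is a minimal NumPy illustration of BAP as it is commonly described (each attention map weights the feature maps element-wise, followed by global average pooling), not the thesis's implementation. The function name, tensor shapes, and random inputs are assumptions made for the example.

```python
import numpy as np


def bilinear_attention_pooling(features, attention):
    """Pool feature maps under each attention map.

    features:  array of shape (C, H, W), the backbone feature maps.
    attention: array of shape (M, H, W), one map per attended part.
    Returns an (M, C) part-feature matrix.
    """
    _, height, width = features.shape
    if attention.shape[1:] != (height, width):
        raise ValueError("feature and attention maps must share spatial size")
    # For every (attention map m, channel c) pair, multiply element-wise and
    # sum over spatial positions; dividing by H*W gives global average pooling.
    return np.einsum("mhw,chw->mc", attention, features) / (height * width)


# Hypothetical sizes: C=8 channels, M=3 attention maps, 4x4 spatial grid.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))
attention = rng.random((3, 4, 4))

parts = bilinear_attention_pooling(features, attention)
print(parts.shape)  # (3, 8)
```

Each row of `parts` is a feature vector for one attended region. In attention-guided augmentation schemes, the same attention maps are typically reused to crop or mask regions of the input image; how the thesis does this in its two learning modules is not specified in the abstract.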
Keywords/Search Tags: fine-grained classification, attention mechanism, data enhancement, bilinear network, attention learning, target location