
Research On Multi-attention Mechanism Fusion For Fine-grained Image Classification

Posted on: 2022-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Li
Full Text: PDF
GTID: 2518306500465434
Subject: Software engineering

Abstract/Summary
In recent years, advances in graphics processors have greatly increased the parallel computing power of machines, driving the rapid development of deep learning. In computer vision, researchers have applied deep convolutional models to traditional image classification tasks, where results now approach or even surpass human performance. More recently, image classification has moved toward a finer granularity, and fine-grained image classification (FGIC) has become an active research area. Unlike traditional image classification, FGIC suffers from small inter-class differences and large intra-class differences, compounded by variation in shooting conditions. Attention mechanisms have been widely used in FGIC, but traditional attention mechanisms follow a locate-then-process paradigm: the model must run in separate stages, and a single, fixed attention method is used. To further improve the performance of deep convolutional neural networks on FGIC, this thesis studies an end-to-end, weakly supervised FGIC model based on the fusion of multiple attention mechanisms.

This thesis constructs four attention focusing mechanisms embedded in a deep convolutional FGIC network: class activation map attention (CAM), channel attention (CA), spatial attention (SA), and channel-spatial fusion attention (CSCA). The four methods are compared and analyzed on the fine-grained image classification datasets CUB-200-2011, Stanford Dogs, and Stanford Cars. The results show that all four methods can focus on local features and improve the classification performance of the convolutional network, with CSCA yielding the most significant gain.

Building on this, the thesis further proposes a multi-attention, multi-scale learning network (MAMSL). As a weakly supervised network, MAMSL relies on neither manually annotated part labels nor bounding-box annotations. Unlike traditional locate-then-process deep learning models, MAMSL learns end-to-end without handcrafted algorithms for specific features, which effectively improves the generalization ability of the model. MAMSL extracts the focusing results of the backbone network's multiple attention modules, feeds them into branch convolutional feature extractors of different depths to obtain feature representations at different scales, and fuses these by stacking to produce the classification output. Experimental results show that MAMSL is an end-to-end, weakly supervised network model that outperforms fine-grained image classification models proposed in recent years based on SENet, CBAM, MA-CNN, and AHMTL.
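To illustrate the channel-spatial fusion idea described above, the following is a minimal NumPy sketch of a CBAM-style channel-then-spatial attention module. It is not the thesis's implementation: the function names are illustrative, the random weight matrices stand in for learned parameters, and the spatial gate uses a simple pooled sum where a learned convolution would normally be applied.

```python
import numpy as np

def _sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def channel_attention(x, reduction=4, seed=0):
    """Gate each channel of a (C, H, W) feature map by a learned weight in (0, 1)."""
    C = x.shape[0]
    avg = x.mean(axis=(1, 2))  # global average pooling per channel, shape (C,)
    mx = x.max(axis=(1, 2))    # global max pooling per channel, shape (C,)
    # Shared two-layer bottleneck MLP; random weights stand in for trained ones.
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((C // reduction, C)) * 0.1
    w2 = rng.standard_normal((C, C // reduction)) * 0.1
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)
    gate = _sigmoid(mlp(avg) + mlp(mx))          # per-channel gate, shape (C,)
    return x * gate[:, None, None]

def spatial_attention(x):
    """Gate each spatial location of a (C, H, W) feature map."""
    avg = x.mean(axis=0, keepdims=True)  # pool across channels, shape (1, H, W)
    mx = x.max(axis=0, keepdims=True)
    gate = _sigmoid(avg + mx)            # a conv layer would be learned here
    return x * gate

def channel_spatial_attention(x):
    """Channel attention first, then spatial attention (CBAM ordering)."""
    return spatial_attention(channel_attention(x))

x = np.random.default_rng(1).standard_normal((8, 4, 4))
y = channel_spatial_attention(x)
print(y.shape)  # shape is preserved; only feature magnitudes are re-weighted
```

Because both gates lie in (0, 1), the module re-weights rather than reshapes the feature map, so it can be inserted between backbone stages without changing downstream layer dimensions.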
Keywords/Search Tags: Fine-grained Image Classification, Convolutional Neural Network, Attention Proposal, Multi-scale Learning, Weak Supervision