
Research On Fine-Grained Image Classification Based On Deep Learning

Posted on: 2023-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: Q Yang
GTID: 2568306794955399
Subject: Computer technology
Abstract/Summary:
Fine-grained image classification is the task of distinguishing subclasses within a single image category. Because the differences between classes are small, the task requires attention to more detailed image features. Most previous work has been devoted to localizing discriminative regions with weakly supervised approaches and using local features for classification. However, problems remain, such as insufficient localization of local regions and underutilization of multi-grained local features. To address these problems, this paper designs a series of weakly supervised fine-grained image classification networks using deep learning techniques, as follows.

(1) Multi-grained feature fusion net. This method uses iterative learning to gradually adjust the different feature extraction stages of the backbone network. The part dislocation module generates input images of different granularities, so that the different feature extraction stages correspond to different granularities. This allows a single backbone network to extract features at multiple granularities. At the same time, iterative learning passes empirical information layer by layer, thereby mining information from complementary granularities. An attention module is introduced to refine and screen the features. To make full use of the complementarity of multi-granularity local features, the variable convolution module maps the screened multi-granularity features to the same feature space and fuses them, and the classifier uses the fused features for the final classification.

(2) Local feature feedback transformer. This method aims to extend the receptive field of the vision transformer so that it adapts to the task of fine-grained image classification. It proposes a part selection module, which uses the self-attention weight map generated during network training to measure the importance of patch tokens without introducing additional parameters, and locates the most discriminative image patch tokens as local features. The image of the corresponding region is then enlarged and fed back to the network as input through the local feedback channel. The overall network can thus adjust its parameters according to the characteristics of the local features, so that it adapts to both overall and local image features and classifies more accurately.

(3) Feature select and fusion transformer. This method proposes a cross-axis attention module that measures the similarity between each patch token and the classification token and selects the patch tokens with high similarity. This step is repeated in every layer of the transformer encoder except the last, so that enough discriminative patch tokens are extracted. A feature fusion module then aggregates the category encoding and all the discriminative patch tokens into fused features and sends them to the last layer of the transformer encoder. The classification token learns local features and low-level features from the aggregated features, which makes the network more sensitive to multi-grained features.

In summary, this paper mainly uses attention mechanisms to improve the ability of fine-grained classification networks to localize discriminative features, and uses different feature fusion methods to enhance the sensitivity of the networks to multi-granularity features. The effectiveness of the three proposed network models is experimentally verified on commonly used fine-grained image classification datasets.
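The part selection idea in (2), ranking patch tokens by self-attention weight and enlarging the corresponding image region, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the layout of `cls_attention` (one row of classification-token-to-patch weights per head), the head averaging, and the bounding-box step are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def select_patch_tokens(cls_attention: List[List[float]], k: int) -> List[int]:
    """Return indices of the k patches with the highest head-averaged attention.

    cls_attention[h][p] is assumed to hold the attention weight from the
    classification token to patch p in head h (hypothetical layout).
    """
    num_heads = len(cls_attention)
    num_patches = len(cls_attention[0])
    # Average the attention each patch receives across all heads.
    mean_scores = [
        sum(head[p] for head in cls_attention) / num_heads
        for p in range(num_patches)
    ]
    # Rank patches from highest to lowest score; no learned parameters involved.
    ranked = sorted(range(num_patches), key=lambda p: -mean_scores[p])
    return ranked[:k]


def patch_bounding_box(
    indices: Sequence[int], grid_width: int, patch_size: int
) -> Tuple[int, int, int, int]:
    """Pixel box (left, top, right, bottom) enclosing the selected patches."""
    rows = [i // grid_width for i in indices]
    cols = [i % grid_width for i in indices]
    return (
        min(cols) * patch_size,
        min(rows) * patch_size,
        (max(cols) + 1) * patch_size,
        (max(rows) + 1) * patch_size,
    )


# Toy example: 2 heads, a 3x3 grid of 16-pixel patches (made-up weights).
head0 = [0.02, 0.03, 0.05, 0.05, 0.40, 0.20, 0.05, 0.10, 0.10]
head1 = [0.05, 0.05, 0.05, 0.05, 0.30, 0.30, 0.05, 0.10, 0.05]
top = select_patch_tokens([head0, head1], k=2)
box = patch_bounding_box(top, grid_width=3, patch_size=16)
print(top, box)
```

In a feedback setup like the one described, the region given by `box` would be cropped from the input image, resized to the network input resolution, and passed through the same network again.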
Keywords/Search Tags: Fine-grained image classification, Attention mechanisms, Multi-granularity features, Feature fusion