Research And Implementation Of Fine-grained Image Classification Method For Weakly Supervised Attention

Posted on: 2022-07-01  Degree: Master  Type: Thesis
Country: China  Candidate: K K Li  Full Text: PDF
GTID: 2518306344992719  Subject: Computer technology
Abstract/Summary:
In recent years, the rapid development of artificial intelligence and deep learning frameworks has promoted the widespread application of computer vision in real life. Traditional image classification can no longer meet the needs of real-world scenes, and the challenging problem of fine-grained image classification has become an active research direction in computer vision. Its goal is to divide a coarse-grained category into fine sub-categories, but the task faces two major difficulties: large intra-class differences and subtle inter-class differences. Early work achieved relatively good classification results through strongly supervised learning methods that required a large number of manual annotations. However, such methods rely excessively on manually labeled information, which limits their feasibility in practical classification and recognition tasks. Therefore, after analyzing the current state and open problems of fine-grained image classification, this thesis focuses on weakly supervised methods. The work covers the following aspects:

(1) To alleviate the interference caused by large intra-class differences, this thesis uses data augmentation to expand the data set from multiple perspectives, such as color transformation and geometric transformation. These transformations reset the spatial distribution of pixels in a single sample. This mitigates the sample variation caused by external environmental factors and effectively improves the generalization ability of the network model.

(2) To address the insufficient feature learning capability and scattered information of the convolutional network in fine-grained image classification models, this thesis proposes a split-attention bilinear aggregated residual network and embeds the new network into the original B-CNN. The aggregated residual convolution block captures more discriminative information from different subspaces. An improved split-attention module is then embedded in each aggregated residual block to strengthen the dependencies in image information along both the spatial and channel dimensions, thereby improving the robustness of the network.

(3) To address the very subtle differences between categories, this thesis introduces an optimized mutual channel attention module that improves the extraction of saliency information in both the channel domain and the spatial domain. The module constrains the channel weight distribution and discards redundant features, forcing the network to focus on more discriminative regions and thus capture more diverse and highly discriminative information.

The improved model is trained with transfer learning based on model fine-tuning, and is verified and analyzed through ablation experiments and comparative experiments. Experiments show that the proposed model outperforms most mainstream networks and that its performance is significantly improved over the original model. Finally, a fine-grained classification system is developed to complete the application verification of the model.
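Item (2) builds on B-CNN, whose core operation is bilinear pooling. The abstract does not give the pooling formula or any implementation detail, so the following is only a minimal sketch of the standard bilinear pooling step: an outer product of two feature vectors at each location, pooled over locations, followed by signed square root and L2 normalization. It is not the thesis's improved network. The function name `bilinear_pool`, the nested-list feature layout, and the choice to average (rather than sum) over locations are assumptions made for illustration.

```python
import math


def bilinear_pool(feat_a, feat_b):
    """Standard bilinear pooling of two feature maps (illustrative sketch).

    feat_a: list of locations, each a list of Ca channel values.
    feat_b: list of locations, each a list of Cb channel values.
    Returns a flattened, normalized descriptor of length Ca * Cb.
    """
    if not feat_a or len(feat_a) != len(feat_b):
        raise ValueError("feature maps must share a non-zero number of locations")
    n_loc = len(feat_a)
    ca, cb = len(feat_a[0]), len(feat_b[0])

    # Accumulate the outer product of the two feature vectors over all locations.
    pooled = [0.0] * (ca * cb)
    for va, vb in zip(feat_a, feat_b):
        for i in range(ca):
            for j in range(cb):
                pooled[i * cb + j] += va[i] * vb[j]

    # Assumption: average over locations (sum pooling differs only by a scale factor).
    pooled = [p / n_loc for p in pooled]

    # Signed square root, then L2 normalization.
    pooled = [math.copysign(math.sqrt(abs(p)), p) for p in pooled]
    norm = math.sqrt(sum(p * p for p in pooled))
    return [p / norm for p in pooled] if norm > 0 else pooled


# Hypothetical toy input: 2 spatial locations, 2 channels, same stream used twice.
feat = [[1.0, 0.0], [0.0, 2.0]]
descriptor = bilinear_pool(feat, feat)
print(descriptor)
```

In a real model the two inputs would be convolutional feature maps reshaped to (locations, channels), and the descriptor would feed a linear classifier; passing the same map twice, as in the toy call, corresponds to the symmetric case.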
Keywords/Search Tags: Fine-grained image classification, saliency feature map, diversified scale information, cross-channel attention module, split attention