| In recent years, with the rapid development of computer technology and the widespread establishment of big data platforms, large amounts of image data have appeared in everyday life and are widely used in computer vision tasks. As one of the research hotspots, fine-grained image classification (FGIC) has high application value in product quality inspection, biodiversity monitoring, intelligent retail, intelligent transportation, and other fields. The task focuses on distinguishing subcategories under the same parent class. Its main challenges are small inter-class variance and large intra-class variance, which severely affect classification performance. With the emergence of convolutional neural networks (CNNs), FGIC has made great progress. Existing FGIC methods are mainly based on localization subnetworks or deep feature encoding. Their gains in classification performance usually come at the cost of higher model complexity, and additional annotation information is required to constrain model optimization. These drawbacks limit the deployment and application of FGIC models in practice. To address these problems, this thesis proposes a gradient-guided attention mechanism that makes the network attend to the parts that contribute to classification while learning the differences between subclasses of the same parent class, thereby improving classification performance. First, a mixed attention model for FGIC is proposed: on top of a backbone network, a channel attention module and a region attention module are built in sequence to emphasize discriminative information across channels and regions while suppressing the influence of uninformative parts. Second, to address the lack of effective supervision in current mainstream attention models, we propose a method that uses gradient visualization to guide the attention module. It makes the response obtained by backpropagating from the target category consistent with the weights of the attention mechanism, so that the components the network focuses on match the components that contribute most to the classification. Finally, to address the problem, widespread in current FGIC models, that subclasses of the same superclass are easily confused with one another, we introduce the multi-level classification labels available in fine-grained datasets and design a loss function that optimizes the network's gradient updates, encourages the network to learn feature differences between subclasses of the same superclass, and enlarges the angular margin between the feature vectors of fine-grained classes. The proposed method requires no additional bounding-box annotations and can be trained end to end. Good results are obtained on three common FGIC datasets, which demonstrates the effectiveness of the method. |
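
The gradient-guided supervision of the attention module described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the channel importance is computed in a Grad-CAM-like way (global average of each channel's gradient map, negative values clipped, then rescaled), the consistency term is a mean squared error, and the names `channel_importance` and `guidance_loss` are hypothetical. The gradients are passed in as plain nested lists rather than produced by a real backward pass.

```python
from typing import List

Matrix = List[List[float]]  # one H x W gradient map for a single channel


def channel_importance(gradients: List[Matrix]) -> List[float]:
    """Assumed Grad-CAM-style importance per channel.

    Averages each channel's gradient map, keeps only positive evidence,
    and rescales so the most important channel has weight 1.
    """
    pooled = []
    for grad_map in gradients:
        values = [v for row in grad_map for v in row]
        pooled.append(max(sum(values) / len(values), 0.0))
    peak = max(pooled)
    if peak == 0.0:
        # No channel supports the target class; nothing to guide towards.
        return [0.0] * len(pooled)
    return [p / peak for p in pooled]


def guidance_loss(attention: List[float], gradients: List[Matrix]) -> float:
    """Assumed consistency term: mean squared gap between the attention
    weights and the gradient-derived channel importance."""
    target = channel_importance(gradients)
    if len(attention) != len(target):
        raise ValueError("one attention weight per channel is required")
    return sum((a - t) ** 2 for a, t in zip(attention, target)) / len(target)
```

With three 2×2 gradient maps filled with 0.5, -0.25, and 0.25 respectively, `channel_importance` yields `[1.0, 0.0, 0.5]`, so attention weights equal to that vector incur zero loss, while weights that favor the second channel are penalized. In a real training setup such a term would presumably be added to the classification loss, but the weighting between the two is not stated in the abstract.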