
Research On Image Classification Algorithm Based On Contextual Discriminative Feature Fusion

Posted on: 2021-01-22  Degree: Master  Type: Thesis
Country: China  Candidate: G H Qin
GTID: 2438330623471704  Subject: Software engineering
Abstract/Summary:
In the current era of big data, the availability of large datasets and rapid growth in computing power have driven the rise of deep learning across many fields. Convolutional neural networks (CNNs), the principal deep learning method for vision, have matched or even surpassed expert-level performance on challenging visual tasks such as tumor recognition, vehicle recognition, and face recognition, and have become the mainstream approach to image classification. Despite this success, a deep learning model is built from complex multilayer non-linear structures and involves tens of thousands of parameter computations, so researchers cannot intuitively observe its working state or explain how it works. Optimizing the model structure by relying only on the experience of the algorithm designer makes it hard to locate problems quickly, and this has become a bottleneck for the development of CNNs. From the perspective of deep learning interpretability, this thesis uses CNN visualization techniques to examine ResNet50, a current mainstream CNN, and discusses two problems it reveals: important contextual information is ignored, and secondary feature regions of the target receive insufficient attention. To address these problems, the thesis proposes a novel CNN architecture that outperforms current mainstream CNN models and, building on it, an upgraded version with half the parameters and better performance. While exploring the model structure, several advanced training strategies are combined in ablation experiments, and a set of general training methods is summarized for improving CNN performance on image classification tasks. The main work of this thesis is as follows:

(1) Ablation experiments were performed on a series of advanced training strategies. They show that, without changing the model and without significantly increasing the amount of computation, tuning the batch size to its optimal value, using the Cosine or HTD learning rate schedule, and applying data augmentation strategies such as Cutout and Mixup can each significantly improve the performance of most CNNs, and that combining them improves accuracy further. For example, using the full training strategy improves the accuracy of VGG19 and ResNet110 on the CIFAR-100 dataset by 3.09% and 4.88%, respectively.

(2) To address the problems that current mainstream models ignore important context information and pay insufficient attention to secondary feature regions of the target, a multi-scale feature fusion residual model, MSResNet, is proposed. The model uses the residual structure to solve the degradation problem that arises as the network deepens. It also enlarges the receptive field by fusing multi-scale features from grouped convolutions and dilated (atrous) grouped convolutions, which captures more context information and improves network performance, and it strengthens the model's ability to capture secondary feature regions by combining the Cutout and Mixup data augmentation strategies. In experiments, MSResNet reaches 96.84% and 84.42% accuracy on the CIFAR-10 and CIFAR-100 datasets, respectively, surpassing advanced image classification models such as VGG, ResNet, DenseNet, and ResNeXt. A comparison of class activation maps indicates that MSResNet resolves both problems.

(3) Although MSResNet performs strongly on image classification tasks, it has a large number of parameters. Therefore, building on MSResNet, this thesis proposes CA-MSResNet, a residual network based on a channel attention mechanism and multi-scale fusion, together with a new channel attention architecture, the CA module, which suppresses pixels that do not belong to the target class. Experiments show that CA-MSResNet has half as many parameters as MSResNet, which addresses MSResNet's large parameter count, while improving accuracy on the CIFAR-10 and CIFAR-100 datasets by 0.2% and 0.49%, respectively, compared with MSResNet. Class activation maps further show that CA-MSResNet activates the target subject more fully than MSResNet.
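
The training strategies named in item (1) are standard techniques, and a minimal sketch may help make them concrete. The Python below is illustrative only and is not the thesis's implementation: the function names `cosine_lr`, `cutout`, and `mixup`, the default values, and the plain-list image representation are all assumptions made for this example. It shows a cosine-annealed learning rate, Cutout (zeroing one random square patch), and Mixup (blending two samples and their one-hot labels with a Beta-distributed weight).

```python
import math
import random


def cosine_lr(step, total_steps, lr_max=0.1, lr_min=0.0):
    """Cosine-annealed learning rate for `step` in [0, total_steps].

    lr_max and lr_min are illustrative defaults, not values from the thesis.
    """
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def cutout(image, size, rng=random):
    """Return a copy of a 2-D image (list of rows) with one square patch zeroed.

    The patch centre is drawn uniformly over the image and the patch is
    clipped at the borders, so it can be smaller than size x size.
    """
    height, width = len(image), len(image[0])
    y0 = rng.randrange(height) - size // 2
    x0 = rng.randrange(width) - size // 2
    top, bottom = max(0, y0), min(height, y0 + size)
    left, right = max(0, x0), min(width, x0 + size)
    out = [row[:] for row in image]  # leave the input untouched
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = 0.0
    return out


def mixup(x_a, y_a, x_b, y_b, alpha=1.0, rng=random):
    """Blend two flattened samples and their one-hot labels.

    The mixing weight lam is drawn from Beta(alpha, alpha); alpha=1.0 is an
    assumed default that makes lam uniform on [0, 1].
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam
```

In a training loop, `cosine_lr` would typically be evaluated once per step or epoch, while `cutout` and `mixup` would be applied to each batch before the forward pass. Passing a seeded `random.Random` instance as `rng` makes the augmentations reproducible.
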
Keywords/Search Tags: Deep learning, Image classification, Class activation map, Multi-scale feature fusion, Data augmentation, Channel attention