
Research On Structure Optimization Of Deep Classification Network

Posted on: 2021-01-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X P Zhang
Full Text: PDF
GTID: 1488306464957349
Subject: Software engineering

Abstract/Summary:
With deep learning playing an increasingly important role in image classification through its powerful feature learning and representation capabilities, a great number of deep classification networks have been proposed. Deep classification networks are strongly affected by both the dataset and the network structure, and they consume massive computing resources. The structural optimization of deep classification networks is therefore significant work: improving the performance of deep networks while designing lighter modules that reduce computing-resource consumption has become a research hotspot in recent years. At present, deep classification networks face the following research problems: 1) how to alleviate depth and width redundancy in deep modules; 2) how to obtain the attention information of feature maps losslessly; 3) how to solve the information loss and the interruption of gradient continuity caused by the global average pooling operation. Addressing these three problems, this thesis optimizes existing network structures and proposes corresponding solutions to improve the classification performance of deep classification networks. The main research work and innovations of this thesis are as follows:

(1) In the optimization of the depth and width structure of the network, this thesis combines the design ideas of residual connections, dense connections, and the GoogLeNet family to propose the Cross-Residual-Inception (CRI) module. The CRI module uses two cross-residual connections on each branch of the width structure, in contrast to the single residual connection of Inception-Residual modules. This design effectively transfers feature information from the front layers of each sub-network to the back layers while avoiding the redundancy caused by multiple plain residual connections. Different convolution-kernel configurations are adopted on different branches so that each branch learns features under richer receptive fields, enhancing the diversity of feature learning in the width structure. On two classic image classification datasets, the CRI network achieves the same accuracy with less depth and fewer parameters.

(2) In the optimization of the channel-domain attention mechanism, this thesis proposes a depthwise squeeze-and-refinement (DSR) module. The module uses lossless global information from each feature map to recalibrate the feature maps based on their independence rather than their interdependence, providing a new research direction for the channel-domain attention mechanism. To address the feature-information loss caused by the global pooling used in attention mechanisms such as SENet, this thesis replaces global pooling with depthwise convolution to obtain feature-map information. The DSR module enhances feature maps that carry more discriminative information and restrains those with less, thereby effectively improving the performance of the deep classification network. To reduce parameters and obtain the global information of each feature map quickly, the module uses two depthwise convolutional layers. The DSR module is simple to implement and can be applied to many types of deep classification networks. On multiple datasets, deep classification networks using the DSR method achieve larger performance improvements than attention mechanisms such as SENet and CBAM. This thesis designs multiple ablation experiments to analyze the influence of the number of convolutional layers, the refining function, feature-map independence, and module location on the DSR module.

(3) In the optimization of global pooling, this thesis proposes a global learnable pooling (GLPool) method to enhance the contribution of discriminative high-dimensional semantic information and to ensure the continuity of gradient transmission. Unlike global average pooling, which spreads the gradient uniformly, GLPool highlights the gradient of salient feature regions that contain more discriminative information, thereby improving the classification performance of the deep network. GLPool also adapts to the network size and has few parameters. Regarding the location and size of pooling, this thesis proposes and verifies the hypothesis that the larger the pooling area and the closer it is to the output end of the network, the greater the pooling layer's impact on performance. To show that GLPool enhances the contribution of discriminative feature regions, this thesis uses the Class Activation Map (CAM) to visualize the learning results of the deep classification network. On the ImageNet32 and CIFAR100 datasets, GLPool significantly improves the performance of ResNet, GoogLeNet, ShuffleNet, and other networks.
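To make the two pooling-related ideas above concrete, here is a minimal NumPy sketch (not the thesis code; the sigmoid refining function, the single-layer squeeze, and all shapes are illustrative assumptions — the thesis uses two depthwise layers, collapsed here into one weighted sum for brevity). It shows a DSR-style squeeze, where each feature map is summarized by its own learnable full-size kernel instead of a uniform average, and a GLPool-style head, where a learnable weight per spatial position replaces the fixed 1/(H×W) of global average pooling.

```python
import numpy as np

rng = np.random.default_rng(0)

def dsr_recalibrate(x, squeeze_kernels):
    """DSR-style channel recalibration (simplified sketch).

    x: feature maps, shape (C, H, W).
    squeeze_kernels: one learnable full-size kernel per channel, shape
    (C, H, W) — a depthwise convolution whose kernel covers the whole
    map, so no spatial information is discarded by uniform averaging.
    Each channel is squeezed to a scalar, refined by a sigmoid gate,
    and rescales only its own map (channel independence: no mixing).
    """
    s = np.einsum('chw,chw->c', x, squeeze_kernels)  # squeeze: (C,)
    g = 1.0 / (1.0 + np.exp(-s))                     # refine: sigmoid gate
    return x * g[:, None, None]                      # recalibrated maps

def glpool(x, w):
    """GLPool-style global learnable pooling (sketch).

    w: learnable spatial weights, shape (H, W), shared across channels,
    so salient positions can contribute (and receive gradient) more
    than a uniform average would allow.
    """
    return np.einsum('chw,hw->c', x, w)

C, H, W = 4, 8, 8
x = rng.standard_normal((C, H, W))

# With uniform weights, GLPool reduces exactly to global average pooling.
w_avg = np.full((H, W), 1.0 / (H * W))
assert np.allclose(glpool(x, w_avg), x.mean(axis=(1, 2)))

# DSR recalibration preserves the feature-map shape.
y = dsr_recalibrate(x, rng.standard_normal((C, H, W)))
assert y.shape == x.shape
```

The uniform-weight check also explains why GLPool can be described as lightweight and size-adaptive: it adds only H×W parameters per pooling layer, and global average pooling is the special case where those weights are fixed at 1/(H×W).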
Keywords/Search Tags:Deep learning, Image classification, Residual, Attention, Distinguishing features