
Research On Model Optimization Of Deep Convolutional Neural Networks

Posted on: 2021-11-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Lu
Full Text: PDF
GTID: 1488306569986869
Subject: Computer application technology
Abstract/Summary:
In recent years, deep learning models have been widely applied in many intelligent fields due to their outstanding learning abilities. However, as the intelligent technology revolution continues to expand, the application of deep learning models in complex scenarios faces two main challenges. (1) Poor generalization: models trained for a specific task cannot effectively cope with complex and varied scenarios, or with test samples generated by end users whose distribution differs greatly from the training data, which directly degrades the models' output performance. (2) Large model size and inefficiency: the complexity of deep learning frameworks and their huge demand for computation and storage conflict with the limited storage and computing capacity of devices in practical application scenarios.

This thesis studies the theories of generalization and efficient compression in deep convolutional neural networks and puts forward corresponding regularization and compression methods to optimize the models, with the goals of improving the networks' generalization, training efficiency, and inference efficiency. The specific methods proposed in this thesis are as follows:

(1) To address the poor flexibility and pertinence of regularization methods based on feature augmentation, Auto-Adaptive Multiplicative Noise Regularization Convolutional Neural Networks (AAMN-Nets) are proposed. In the regularization process of this model, a learnable mechanism improves the flexibility of the regularization; a multi-scale mechanism generates finer-grained multi-frequency augmented information; a conditional prediction weight mechanism improves the pertinence of the regularization and generates augmented data more reasonably; and a self-identity mapping mechanism improves the universality of the regularization, making it applicable to different networks. Together, these mechanisms effectively improve the flexibility and pertinence of the regularization process.

(2) To deal with the slow training convergence often caused by regularization methods based on feature augmentation, Fast-Convergence Additional Noise Regularization Convolutional Neural Networks (FCAN-Nets) are proposed. In this model, additive noise drawn from a learnable noise distribution replaces the traditional multiplicative noise, improving regularization flexibility and reducing the dependence of the augmented data on the sample features; this shrinks the distribution gap between the augmented data and the original data and stabilizes training. Furthermore, a block attention mechanism adds noise only to the local block areas of the feature map activated by attention, reducing the noisiness of the feature map and further stabilizing training. Finally, the learning process is stabilized again by letting the original and noisy features share convolutions and by fusing the output features with predicted weights. Compared with traditional regularization methods, the proposed model improves training efficiency significantly.

(3) To resolve the spatial parameter redundancy of compression methods based on group convolution, and the low efficiency of spatial convolution as implemented by the underlying frameworks, Super Sparse Convolutional Neural Networks (SSC-Nets) are proposed. The network is compressed in the spatial extent of the filters, and the number of groups is greatly reduced, by diluting two-dimensional plane convolution kernels into three-dimensional convolution kernels. This dilution extracts spatial geometric features and channel difference information from the feature maps at the same time, preserving the network's learning ability. Under this design criterion, each plane kernel in a three-dimensional kernel has only one nonzero parameter, so the network can be implemented entirely with convolutions of spatial size "1 × 1". This saves the "im2col" and "col2im" operation time incurred by the underlying framework when implementing spatial convolutions, and is therefore better supported in practice. As a result, SSC-Nets compress networks more effectively and improve both training and inference efficiency.

(4) To cope with the channel-extent parameter redundancy of compression methods based on group convolution, and the low efficiency of group convolution as implemented by the underlying frameworks, Sparse Repeated Group Convolutional Neural Networks (SRGC-Nets) are proposed. In this method, the channel extent of group convolution is further decomposed to generate a group extent. SRGC-Nets decrease the parameter redundancy in the group extent by sharing the same spatial filters across different groups. This also avoids the "channel-wise" convolutions otherwise needed to reduce the number of groups, realizing efficient network compression both theoretically and practically and improving the networks' actual efficiency. In addition, every shared filter processes more information in one training iteration, which improves the learning ability of the convolution kernels. Compared with traditional compression methods based on group convolution, SRGC-Nets achieve much higher training efficiency and accuracy.
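The core idea behind AAMN-Nets' multiplicative noise and self-identity mapping can be illustrated with a minimal NumPy sketch. This is not the dissertation's implementation: the function name, the fixed scale `alpha` (a stand-in for the learnable mechanism), and the Gaussian noise choice are all assumptions for illustration; the point is that noise multiplies the features during training, while at inference the layer reduces to the identity, so it can be dropped into any network unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def multiplicative_noise_reg(feature, alpha=0.1, training=True):
    """Illustrative multiplicative-noise regularization.

    During training each activation is scaled by (1 + alpha * eps),
    eps ~ N(0, 1); at inference the layer is the identity mapping,
    so the regularizer is transparent to the host network.
    """
    if not training:
        return feature                      # self-identity mapping
    eps = rng.standard_normal(feature.shape)
    return feature * (1.0 + alpha * eps)    # multiplicative augmentation

x = np.ones((2, 4, 8, 8))                   # (batch, channels, H, W)
y_train = multiplicative_noise_reg(x, alpha=0.1, training=True)
y_eval = multiplicative_noise_reg(x, training=False)
```

In the actual method, `alpha` would be learned per feature (the "learnable mechanism") and modulated by conditional prediction weights rather than fixed.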
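FCAN-Nets' two key ingredients, additive (rather than multiplicative) noise and noise confined to attention-selected blocks, can be sketched as follows. Again this is a hypothetical simplification: `mu` and `sigma` stand in for the learnable noise distribution, and picking the block with the highest mean activation stands in for the block attention mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)

def block_additive_noise(feature, mu=0.0, sigma=0.1, block=4, training=True):
    """Add N(mu, sigma) noise only to the most activated block per sample.

    Restricting the noise to one local block keeps most of the feature
    map clean, which is the stabilizing effect described in the abstract.
    """
    if not training:
        return feature
    b, c, h, w = feature.shape
    out = feature.copy()
    # Mean activation of each non-overlapping (block x block) region.
    scores = feature.reshape(b, c, h // block, block,
                             w // block, block).mean(axis=(1, 3, 5))
    for i in range(b):
        bi, bj = np.unravel_index(np.argmax(scores[i]), scores[i].shape)
        noise = mu + sigma * rng.standard_normal((c, block, block))
        out[i, :, bi * block:(bi + 1) * block,
                  bj * block:(bj + 1) * block] += noise
    return out

x = np.zeros((1, 2, 8, 8))
x[0, :, 0:4, 0:4] = 1.0          # make the top-left block the most activated
y = block_additive_noise(x, sigma=0.5, block=4)
```

Because the noise is additive and drawn from its own distribution, the augmented feature no longer depends multiplicatively on the sample's own values, which is what narrows the gap between augmented and original data.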
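The SSC-Nets claim that a plane kernel with a single nonzero parameter reduces to a 1 × 1 convolution can be verified numerically: a k × k kernel whose only nonzero weight sits at offset (di, dj) is exactly a spatial shift of the input followed by a scalar multiply. The naive `conv2d_valid` helper below is an illustrative stand-in, not the dissertation's code.

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive single-channel 2-D valid cross-correlation."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(2)
x = rng.standard_normal((6, 6))

# A 3x3 kernel with a single nonzero weight wgt at offset (di, dj) ...
di, dj, wgt = 0, 2, 1.5
k = np.zeros((3, 3))
k[di, dj] = wgt

# ... equals shifting the input by (di, dj) and scaling: a 1x1 conv.
dense = conv2d_valid(x, k)
shifted = wgt * x[di:di + 4, dj:dj + 4]
```

This equivalence is why frameworks can skip the im2col/col2im unpacking step entirely for such kernels: a shifted 1 × 1 convolution is a plain matrix multiply over the channel dimension.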
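The parameter saving from SRGC-Nets' filter sharing can be made concrete with a small counting sketch. The helper below is hypothetical (the abstract does not give a formula), but it follows from the stated design: if all groups reuse one set of filters, the filter parameters shrink by a factor equal to the number of groups.

```python
import numpy as np

def group_conv_params(c_in, c_out, k, groups, share=False):
    """Weight count of a k x k group convolution.

    share=False: each group has its own filters (standard group conv).
    share=True:  all groups reuse one filter set (the SRGC-Nets idea),
                 dividing the filter parameters by `groups`.
    """
    per_group = (c_out // groups) * (c_in // groups) * k * k
    return per_group if share else per_group * groups

standard = group_conv_params(64, 64, 3, groups=4)              # 9216 weights
shared = group_conv_params(64, 64, 3, groups=4, share=True)    # 2304 weights
```

Sharing also means each filter sees activations from every group in a single iteration, which is the abstract's argument for the improved learning ability of the shared kernels.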
Keywords/Search Tags: over-fitting of networks, regularization of networks, compression of networks, group convolution, diluted convolution, reused convolution