
Some Optimization Methods For Deep Convolutional Neural Network

Posted on: 2020-09-28 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Li | Full Text: PDF
GTID: 2428330620956385 | Subject: Computational Mathematics
Abstract/Summary:
Deep convolutional neural networks can essentially be seen as optimization models, and designing reasonable models and solution algorithms is the key to such optimization problems. This thesis studies optimization methods for deep convolutional neural networks from two aspects: data selection and network structure design.

For the optimization of data selection, two schemes are considered, one based on gradients and one on active incremental learning. The gradient-based selection scheme works mainly at the algorithm level and measures the importance of each sample by the magnitude of the gradient that a pre-trained model back-propagates from it. Active incremental learning designs selection criteria from the model's prediction results: the prediction confidence on each candidate sample is evaluated by entropy, the stability of the sample is evaluated by divergence information, and the two terms are weighted to obtain the final selection index. Experiments show that the gradient-based selection scheme not only improves model performance stably but also reveals redundancy in the data, and that the active selection scheme achieves considerable training results with only half of the data compared with random screening.

For the optimization of network structure, a multi-path skip-connection structure for classification and a multi-scale loss-function structure for semantic segmentation are considered. The experimental results suggest that the skip-connection structure is similar to an ensemble learning scheme: a large network (strong classifier) with better performance is obtained by integrating multiple small network paths (weak classifiers) of different lengths. This multi-path view also provides a basis for analyzing gradient correlation, under the assumption that half of the neurons are activated. It can be proved that skip connections with batch normalization make the decay of the gradient correlation between layers change from exponential to sublinear, which benefits the stability of the overall network structure.

Considering that different layers of a deep network usually learn different features, a multi-scale loss function is proposed to evaluate the effect of feature extraction at each scale. It is shown that this deep multi-scale scheme is similar to the segmentation idea of the maximum a posteriori model based on Markov random fields. It is also found that the V-cycle scheme in the multigrid method and the U-Net structure in semantic segmentation are similar in structure, and the V-cycle and a simplified U-Net structure are compared in detail.

Why deep learning works is still not fully explained, and most deep learning structures achieve good results only after a large amount of manual tuning. This thesis studies deep network optimization methods from the aspects of data selection and network structure design, in the hope of providing some ideas for explaining deep learning from a theoretical perspective.
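To make the gradient-based selection idea concrete, the following is a minimal sketch (not the implementation used in the thesis) that scores each sample by the norm of the gradient a pre-trained model back-propagates from it; the model, loss function, and dataset here are placeholders supplied by the caller.

```python
import torch

def gradient_importance_ranking(model, loss_fn, dataset, device="cpu"):
    """Rank samples by the gradient magnitude a pre-trained model
    back-propagates from each one (illustrative sketch only)."""
    model.to(device).eval()
    scores = []
    for x, y in dataset:                       # one sample at a time
        x = x.unsqueeze(0).to(device)
        y = torch.as_tensor([y], device=device)
        model.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()                        # gradient of this single sample
        grad_norm = torch.sqrt(sum((p.grad ** 2).sum()
                                   for p in model.parameters()
                                   if p.grad is not None))
        scores.append(grad_norm.item())
    # larger gradient norm = sample assumed to be more informative
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```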
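The active selection index can be sketched in a similar spirit. The abstract does not specify which divergence or weighting is used, so the example below assumes prediction entropy for uncertainty, a KL divergence between two predictions of the same sample (for example, under two augmentations) for stability, and a simple convex weight `alpha`.

```python
import numpy as np

def selection_index(p1, p2, alpha=0.5, eps=1e-12):
    """Combine prediction entropy (uncertainty) with a divergence term
    (instability between two predictions of the same samples).

    p1, p2 : arrays of shape (n_samples, n_classes), softmax outputs.
    Returns one score per sample; higher scores are selected first.
    """
    entropy = -np.sum(p1 * np.log(p1 + eps), axis=1)                  # uncertainty
    kl = np.sum(p1 * (np.log(p1 + eps) - np.log(p2 + eps)), axis=1)   # instability
    return alpha * entropy + (1.0 - alpha) * kl

# toy usage: three candidate samples, four classes
p_a = np.array([[0.25, 0.25, 0.25, 0.25],
                [0.90, 0.05, 0.03, 0.02],
                [0.60, 0.30, 0.05, 0.05]])
p_b = np.array([[0.40, 0.20, 0.20, 0.20],
                [0.88, 0.06, 0.04, 0.02],
                [0.20, 0.60, 0.10, 0.10]])
print(selection_index(p_a, p_b))
```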
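The multi-scale loss for segmentation can likewise be illustrated by a small sketch that sums per-scale losses over downsampled label maps; the use of cross-entropy, nearest-neighbour resizing, and uniform per-scale weights is an assumption for illustration, not the loss actually proposed in the thesis.

```python
import torch
import torch.nn.functional as F

def multiscale_loss(logits_per_scale, target, weights=None):
    """Weighted sum of segmentation losses at several scales (sketch).

    logits_per_scale : list of tensors, each (N, C, H_s, W_s), one per scale.
    target           : (N, H, W) integer label map at full resolution.
    """
    if weights is None:
        weights = [1.0] * len(logits_per_scale)
    total = 0.0
    for logits, w in zip(logits_per_scale, weights):
        # resize the label map (nearest neighbour) to this scale
        scaled = F.interpolate(target.unsqueeze(1).float(),
                               size=logits.shape[-2:], mode="nearest")
        scaled = scaled.squeeze(1).long()
        total = total + w * F.cross_entropy(logits, scaled)
    return total
```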
Keywords/Search Tags:Deep Learning, Gradient Optimization, Active Learning, Multi-path, Multi-scale, Multi-task, Multi-grid