
Some Optimization Methods For Deep Convolutional Neural Network

Posted on: 2020-09-28 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Li | Full Text: PDF
GTID: 2428330620956385 | Subject: Computational Mathematics
Abstract/Summary:
Deep convolutional neural networks can essentially be seen as optimization models, and designing reasonable models and solution algorithms is the key to such optimization problems. This thesis studies optimization methods for deep convolutional neural networks from two aspects: data selection and network structure design.

For the optimization of data selection, two schemes are considered, one based on gradients and one on active incremental learning. The gradient-based selection scheme works mainly at the algorithm level and measures the importance of each sample by the magnitude of the gradient that a pre-trained model back-propagates from it. Active incremental learning designs selection criteria from the model's prediction results: the prediction confidence on each candidate sample is evaluated by entropy, the stability of the sample is evaluated by divergence information, and the two terms are weighted to obtain the final selection index. Experiments show that the gradient-based selection scheme not only improves model performance stably but also reveals redundancy in the data, and that the active selection scheme achieves considerable training results with only half of the data compared with random screening.

For the optimization of network structure, a multi-path skip-connection structure for classification and a multi-scale loss-function structure for semantic segmentation are considered. The experimental results suggest that the skip-connection structure is similar to an ensemble learning scheme: a large network (strong classifier) with better performance is obtained by integrating multiple small network paths (weak classifiers) of different lengths. This multi-path view also provides a basis for analyzing gradient correlation, under the assumption that half of the neurons are activated. It can be proved that skip connections with batch normalization make the decay of the gradient correlation between layers change from exponential to sublinear, which benefits the stability of the overall network structure.

Considering that different layers of a deep network usually learn different features, a multi-scale loss function is proposed to evaluate the effect of feature extraction at each scale. It is shown that this deep multi-scale scheme is similar to the segmentation idea of the maximum a posteriori model based on Markov random fields. It is also found that the V-cycle scheme in the multigrid method and the U-Net structure in semantic segmentation are similar in structure, and the V-cycle and a simplified U-Net structure are compared in detail.

Why deep learning works is still not fully explained, and most deep learning structures achieve good results only after a large amount of manual tuning. This thesis studies deep network optimization methods from the aspects of data selection and network structure design, in the hope of providing some ideas for explaining deep learning from a theoretical perspective.
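To make the gradient-based selection idea concrete, the following is a minimal sketch (not the implementation used in the thesis) that scores each sample by the norm of the gradient a pre-trained model back-propagates from it; the model, loss function, and dataset here are placeholders supplied by the caller.

```python
import torch

def gradient_importance_ranking(model, loss_fn, dataset, device="cpu"):
    """Rank samples by the gradient magnitude a pre-trained model
    back-propagates from each one (illustrative sketch only)."""
    model.to(device).eval()
    scores = []
    for x, y in dataset:                       # one sample at a time
        x = x.unsqueeze(0).to(device)
        y = torch.as_tensor([y], device=device)
        model.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()                        # gradient of this single sample
        grad_norm = torch.sqrt(sum((p.grad ** 2).sum()
                                   for p in model.parameters()
                                   if p.grad is not None))
        scores.append(grad_norm.item())
    # larger gradient norm = sample assumed to be more informative
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```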
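The active selection index can be sketched in a similar spirit. The abstract does not specify which divergence or weighting is used, so the example below assumes prediction entropy for uncertainty, a KL divergence between two predictions of the same sample (for example, under two augmentations) for stability, and a simple convex weight `alpha`.

```python
import numpy as np

def selection_index(p1, p2, alpha=0.5, eps=1e-12):
    """Combine prediction entropy (uncertainty) with a divergence term
    (instability between two predictions of the same samples).

    p1, p2 : arrays of shape (n_samples, n_classes), softmax outputs.
    Returns one score per sample; higher scores are selected first.
    """
    entropy = -np.sum(p1 * np.log(p1 + eps), axis=1)                  # uncertainty
    kl = np.sum(p1 * (np.log(p1 + eps) - np.log(p2 + eps)), axis=1)   # instability
    return alpha * entropy + (1.0 - alpha) * kl

# toy usage: three candidate samples, four classes
p_a = np.array([[0.25, 0.25, 0.25, 0.25],
                [0.90, 0.05, 0.03, 0.02],
                [0.60, 0.30, 0.05, 0.05]])
p_b = np.array([[0.40, 0.20, 0.20, 0.20],
                [0.88, 0.06, 0.04, 0.02],
                [0.20, 0.60, 0.10, 0.10]])
print(selection_index(p_a, p_b))
```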
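The multi-scale loss for segmentation can likewise be illustrated by a small sketch that sums per-scale losses over downsampled label maps; the use of cross-entropy, nearest-neighbour resizing, and uniform per-scale weights is an assumption for illustration, not the loss actually proposed in the thesis.

```python
import torch
import torch.nn.functional as F

def multiscale_loss(logits_per_scale, target, weights=None):
    """Weighted sum of segmentation losses at several scales (sketch).

    logits_per_scale : list of tensors, each (N, C, H_s, W_s), one per scale.
    target           : (N, H, W) integer label map at full resolution.
    """
    if weights is None:
        weights = [1.0] * len(logits_per_scale)
    total = 0.0
    for logits, w in zip(logits_per_scale, weights):
        # resize the label map (nearest neighbour) to this scale
        scaled = F.interpolate(target.unsqueeze(1).float(),
                               size=logits.shape[-2:], mode="nearest")
        scaled = scaled.squeeze(1).long()
        total = total + w * F.cross_entropy(logits, scaled)
    return total
```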
Keywords/Search Tags:Deep Learning, Gradient Optimization, Active Learning, Multi-path, Multi-scale, Multi-task, Multi-grid