
The Sparse Method Of Convolutional Neural Network

Posted on: 2019-10-10    Degree: Master    Type: Thesis
Country: China    Candidate: J Feng    Full Text: PDF
GTID: 2428330593450046    Subject: Mathematics
Abstract/Summary:
In recent years, deep learning, or deep neural networks (DNNs), have been widely used in computer vision, speech recognition and natural language processing. As deep learning has grown in popularity, the DNN approach has been adopted by practitioners and artificial intelligence research companies alike. Because deep neural network models contain a very large number of parameters, however, the approach faces great difficulties in both computation and storage. As a result, deep neural networks are hard to deploy on, for example, mobile devices. For high-dimensional data, the usual neural network trained with standard batch processing becomes inadequate, which calls for techniques that produce neural networks with sparse parameters. One approach to obtaining a sparse network is to 'remove' some unimportant parameters after training and then retrain the remaining parameters. Although the resulting model is sparse, a large neural network must still be trained first in order to derive the sparse one.

This thesis is devoted to developing a regularized dual averaging method based on an online optimization algorithm. At each training step, the network uses a modified backpropagation procedure that sets parameters whose magnitude falls below a certain threshold to 0, so the network parameters remain sparse throughout the training process. When training is completed, the network already has the desired sparsity, and the resulting sparse model can be used directly in subsequent prediction and classification tasks.

The regularized dual averaging method is investigated in this thesis both theoretically and numerically. We first compare the convergence of the regularized dual averaging method with that of stochastic gradient descent, and we prove theoretically that the two methods have comparable convergence behavior. Standard stochastic gradient descent has the advantage of good accuracy, while the regularized dual averaging method minimizes the loss function with an appropriate regularization term and is therefore more effective at increasing the sparsity of the feature weights. Finally, our numerical results show that the sparsity achieved by the regularized dual averaging method is satisfactory and compares favorably with stochastic gradient descent.
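To make the sparsity mechanism concrete, the following is a minimal sketch of one l1-regularized dual averaging step in the closed form given by Xiao (2010), applied to a toy linear least-squares problem. It is an illustrative assumption rather than the exact algorithm of the thesis, and the names rda_l1_update, lam and gamma are chosen for this sketch only.

import numpy as np

def rda_l1_update(avg_grad, t, lam, gamma):
    # One l1-regularized dual averaging (RDA) step in closed form:
    # coordinates whose averaged gradient magnitude is at most lam are
    # truncated exactly to 0, which keeps the weights sparse during
    # training rather than only after post-hoc pruning.
    shrink = np.maximum(np.abs(avg_grad) - lam, 0.0)
    return -(np.sqrt(t) / gamma) * np.sign(avg_grad) * shrink

# Toy usage on a stochastic linear least-squares problem.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 50)), rng.normal(size=200)
avg_grad, w = np.zeros(50), np.zeros(50)
lam, gamma = 0.05, 5.0                      # illustrative hyperparameters
for t in range(1, 501):
    i = rng.integers(len(y))
    grad = (X[i] @ w - y[i]) * X[i]         # stochastic gradient of the squared loss
    avg_grad += (grad - avg_grad) / t       # running average of all past gradients
    w = rda_l1_update(avg_grad, t, lam, gamma)
print("nonzero weights:", np.count_nonzero(w), "of", w.size)

In contrast to stochastic gradient descent with an l1 penalty, whose iterates are rarely exactly zero, the truncation in this update produces exact zeros at every step, which matches the training-time sparsity described in the abstract.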
Keywords/Search Tags: neural network, sparsity, stochastic gradient descent, dual averaging method