
The Sparse Method Of Convolutional Neural Network

Posted on: 2019-10-10    Degree: Master    Type: Thesis
Country: China    Candidate: J Feng    Full Text: PDF
GTID: 2428330593450046    Subject: Mathematics
Abstract/Summary:
In recent years, deep learning, or deep neural networks (DNNs), have been widely used in computer vision, speech recognition and natural language processing. As deep learning has grown in popularity, the DNN approach has been adopted by practitioners and artificial intelligence research companies alike. Because deep neural network models contain a very large number of parameters, however, the approach faces great difficulties in both computation and storage. As a result, deep neural networks are hard to deploy on, for example, mobile devices. For high-dimensional data, the usual neural network trained with standard batch processing becomes inadequate, which calls for techniques that produce neural networks with sparse parameters. One approach to obtaining a sparse network is to 'remove' some unimportant parameters after training and then retrain the remaining parameters. Although the resulting model is sparse, a large neural network must still be trained first in order to derive the sparse one.

This thesis is devoted to developing a regularized dual averaging method based on an online optimization algorithm. At each training step, the network uses a modified backpropagation procedure that sets parameters whose magnitude falls below a certain threshold to 0, so the network parameters remain sparse throughout the training process. When training is completed, the network already has the desired sparsity, and the resulting sparse model can be used directly in subsequent prediction and classification tasks.

The regularized dual averaging method is investigated in this thesis both theoretically and numerically. We first compare the convergence of the regularized dual averaging method with that of stochastic gradient descent, and we prove theoretically that the two methods have comparable convergence behavior. Standard stochastic gradient descent has the advantage of good accuracy, while the regularized dual averaging method minimizes the loss function with an appropriate regularization term and is therefore more effective at increasing the sparsity of the feature weights. Finally, our numerical results show that the sparsity achieved by the regularized dual averaging method is satisfactory and compares favorably with stochastic gradient descent.
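To make the sparsity mechanism concrete, the following is a minimal sketch of one l1-regularized dual averaging step in the closed form given by Xiao (2010), applied to a toy linear least-squares problem. It is an illustrative assumption rather than the exact algorithm of the thesis, and the names rda_l1_update, lam and gamma are chosen for this sketch only.

import numpy as np

def rda_l1_update(avg_grad, t, lam, gamma):
    # One l1-regularized dual averaging (RDA) step in closed form:
    # coordinates whose averaged gradient magnitude is at most lam are
    # truncated exactly to 0, which keeps the weights sparse during
    # training rather than only after post-hoc pruning.
    shrink = np.maximum(np.abs(avg_grad) - lam, 0.0)
    return -(np.sqrt(t) / gamma) * np.sign(avg_grad) * shrink

# Toy usage on a stochastic linear least-squares problem.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 50)), rng.normal(size=200)
avg_grad, w = np.zeros(50), np.zeros(50)
lam, gamma = 0.05, 5.0                      # illustrative hyperparameters
for t in range(1, 501):
    i = rng.integers(len(y))
    grad = (X[i] @ w - y[i]) * X[i]         # stochastic gradient of the squared loss
    avg_grad += (grad - avg_grad) / t       # running average of all past gradients
    w = rda_l1_update(avg_grad, t, lam, gamma)
print("nonzero weights:", np.count_nonzero(w), "of", w.size)

In contrast to stochastic gradient descent with an l1 penalty, whose iterates are rarely exactly zero, the truncation in this update produces exact zeros at every step, which matches the training-time sparsity described in the abstract.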
Keywords/Search Tags: neural network, sparsity, stochastic gradient descent, dual averaging method