
Convergence Analysis Of Adaptive Gradient Algorithms With Unbiased Estimation In Deep Learning

Posted on: 2021-05-21    Degree: Master    Type: Thesis
Country: China    Candidate: S D Zhang    Full Text: PDF
GTID: 2428330626463432    Subject: Computational Mathematics
Abstract/Summary:
Since the concept of deep learning was introduced in 2006, deep learning technology has made great strides as computer hardware has continued to advance. Applications of deep learning appear more and more often in people's work and daily lives, such as recommendation systems, intelligent voice assistants, quantitative operations, and autonomous driving. An excellent deep learning product is usually built on an excellent deep learning model, and a complete model typically comprises the network, the algorithm, the data, and so on. In general, when the network structure and the data samples are fixed, a better optimization algorithm usually leads to more satisfactory experimental results.

Among deep learning optimization algorithms, adaptive gradient algorithms, which evolved from the SGD algorithm, form a particularly effective and popular class, and they remain the main optimization algorithms in deep learning frameworks such as TensorFlow and PyTorch. Adaptive gradient algorithms are built on backpropagation and gradient descent. In practice, gradient-based optimization algorithms are usually divided into first-order and second-order methods. Although second-order algorithms tend to converge faster, they also incur large amounts of computation and storage, so first-order optimization algorithms remain the mainstream today. AdaGrad, RMSProp, and Adam are outstanding representatives of first-order gradient-based methods, and their excellent empirical performance has attracted many researchers to the study of adaptive gradient algorithms. However, the theoretical analysis of adaptive gradient algorithms mostly assumes convex or strongly convex objectives, whereas the objective functions of deep neural networks are usually non-convex, so analyzing the convergence of adaptive gradient algorithms under non-convex conditions is of great theoretical value and practical significance.

The first two chapters of this thesis introduce the necessary background and basic concepts. In Chapter 3, we analyze the convergence of common adaptive gradient algorithms under non-convex conditions, prove their convergence in this setting, and compare their empirical behavior through numerical experiments. In the first half of Chapter 4, we improve the traditional RMSProp algorithm, proposing the RMSProp-Norm algorithm and the RMSPropW-Norm algorithm, which adds a regularization term; we analyze the convergence of both improved algorithms under non-convex conditions and obtain the parameter conditions for the RMSPropW-Norm algorithm. In the second half of Chapter 4, we propose a generalized adaptive gradient algorithm, analyze its convergence, and give a simpler and more easily verifiable sufficient condition for convergence. In the numerical experiments, we compare the two improved algorithms with the original RMSProp algorithm and verify the effectiveness of the proposed algorithms.
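The abstract does not reproduce any update rules. For reference, a minimal sketch of the standard RMSProp update on which the thesis's improvements build (assuming the usual formulation with decay rate beta, step size lr, and stability constant eps; the RMSProp-Norm and RMSPropW-Norm variants themselves are defined in the thesis body and are not reproduced here) might look like:

import numpy as np

def rmsprop_step(w, grad, v, lr=1e-3, beta=0.9, eps=1e-8):
    """One standard RMSProp update (reference formulation only,
    not the thesis's RMSProp-Norm / RMSPropW-Norm variants).

    w    : current parameters
    grad : stochastic gradient at w (assumed to be an unbiased
           estimate of the true gradient, as in the thesis title)
    v    : running average of squared gradients
    """
    v = beta * v + (1.0 - beta) * grad ** 2      # exponential moving average of g^2
    w = w - lr * grad / (np.sqrt(v) + eps)       # per-coordinate scaling of the step
    return w, v

# Usage example: minimize f(w) = ||w||^2 with noisy (unbiased) gradients
w = np.ones(5)
v = np.zeros(5)
rng = np.random.default_rng(0)
for _ in range(1000):
    grad = 2.0 * w + 0.1 * rng.standard_normal(5)  # unbiased gradient estimate
    w, v = rmsprop_step(w, grad, v, lr=0.01)
print(w)  # close to the minimizer 0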
Keywords/Search Tags: deep learning, adaptive gradient, non-convex, convergence, unbiased estimation