
Convergence Analysis Of Adaptive Gradient Algorithms With Unbiased Estimation In Deep Learning

Posted on: 2021-05-21    Degree: Master    Type: Thesis
Country: China    Candidate: S D Zhang    Full Text: PDF
GTID: 2428330626463432    Subject: Computational Mathematics
Abstract/Summary:
Since the concept of deep learning was introduced in 2006, deep learning technology has made great strides as computer hardware has continued to advance. Applications of deep learning appear more and more often in people's work and daily lives, such as recommendation systems, intelligent voice assistants, quantitative operations, and autonomous driving. An excellent deep learning product is usually built on an excellent deep learning model, and a complete model typically comprises the network, the algorithm, the data, and so on. In general, when the network structure and the data samples are fixed, a better optimization algorithm usually leads to more satisfactory experimental results.

Among deep learning optimization algorithms, adaptive gradient algorithms, which evolved from the SGD algorithm, form a particularly effective and popular class, and they remain the main optimization algorithms in deep learning frameworks such as TensorFlow and PyTorch. Adaptive gradient algorithms are built on backpropagation and gradient descent. In practice, gradient-based optimization algorithms are usually divided into first-order and second-order methods. Although second-order algorithms tend to converge faster, they also incur large amounts of computation and storage, so first-order optimization algorithms remain the mainstream today. AdaGrad, RMSProp, and Adam are outstanding representatives of first-order gradient-based methods, and their excellent empirical performance has attracted many researchers to the study of adaptive gradient algorithms. However, the theoretical analysis of adaptive gradient algorithms mostly assumes convex or strongly convex objectives, whereas the objective functions of deep neural networks are usually non-convex, so analyzing the convergence of adaptive gradient algorithms under non-convex conditions is of great theoretical value and practical significance.

The first two chapters of this thesis introduce the necessary background and basic concepts. In Chapter 3, we analyze the convergence of common adaptive gradient algorithms under non-convex conditions, prove their convergence in this setting, and compare their empirical behavior through numerical experiments. In the first half of Chapter 4, we improve the traditional RMSProp algorithm, proposing the RMSProp-Norm algorithm and the RMSPropW-Norm algorithm, which adds a regularization term; we analyze the convergence of both improved algorithms under non-convex conditions and obtain the parameter conditions for the RMSPropW-Norm algorithm. In the second half of Chapter 4, we propose a generalized adaptive gradient algorithm, analyze its convergence, and give a simpler and more easily verifiable sufficient condition for convergence. In the numerical experiments, we compare the two improved algorithms with the original RMSProp algorithm and verify the effectiveness of the proposed algorithms.
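The abstract does not reproduce any update rules. For reference, a minimal sketch of the standard RMSProp update on which the thesis's improvements build (assuming the usual formulation with decay rate beta, step size lr, and stability constant eps; the RMSProp-Norm and RMSPropW-Norm variants themselves are defined in the thesis body and are not reproduced here) might look like:

import numpy as np

def rmsprop_step(w, grad, v, lr=1e-3, beta=0.9, eps=1e-8):
    """One standard RMSProp update (reference formulation only,
    not the thesis's RMSProp-Norm / RMSPropW-Norm variants).

    w    : current parameters
    grad : stochastic gradient at w (assumed to be an unbiased
           estimate of the true gradient, as in the thesis title)
    v    : running average of squared gradients
    """
    v = beta * v + (1.0 - beta) * grad ** 2      # exponential moving average of g^2
    w = w - lr * grad / (np.sqrt(v) + eps)       # per-coordinate scaling of the step
    return w, v

# Usage example: minimize f(w) = ||w||^2 with noisy (unbiased) gradients
w = np.ones(5)
v = np.zeros(5)
rng = np.random.default_rng(0)
for _ in range(1000):
    grad = 2.0 * w + 0.1 * rng.standard_normal(5)  # unbiased gradient estimate
    w, v = rmsprop_step(w, grad, v, lr=0.01)
print(w)  # close to the minimizer 0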
Keywords/Search Tags: deep learning, adaptive gradient, non-convex, convergence, unbiased estimation