
Research On Distributed Adaptive Stochastic Gradient Descent Optimization Algorithms With Spark MLlib

Posted on: 2019-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Fan
Full Text: PDF
GTID: 2428330545476727
Subject: Computer Science and Technology
Abstract/Summary:
Non-convex optimization problems arise naturally in many machine learning settings (e.g., (un)supervised learning, Bayesian learning). For optimization problems in machine learning and deep learning, Stochastic Gradient Descent (SGD) has become the de facto iterative learning algorithm, and many variants of gradient descent have been proposed. However, none of them consider the root cause of oscillation, which occurs when the current training step overshoots the optimum. Distributed optimization methods have also become a prerequisite, because a single machine cannot handle the rapidly growing data volumes and model sizes. Unfortunately, traditional SGD is essentially serial, which makes it unsuitable for large datasets, so researchers have proposed a variety of distributed optimization algorithms. Apache Spark is a unified analytics engine for large-scale data processing, and MLlib is Spark's scalable machine learning library. However, in the current MLlib implementation of SGD the gradients must be synchronized once in every iteration, which can lead to a very slow convergence rate. In addition, the frequent parameter aggregation in MLlib SGD introduces time-consuming shuffle operations when the model dimension is high.

In this paper, we propose a distributed adaptive stochastic gradient descent algorithm based on oscillation analysis and integrate it with the data-parallel MLlib SGD. We also propose an iterative optimization algorithm based on local search and a communication optimization algorithm based on a parameter server to address the shortcomings of the MLlib SGD implementation. The primary contributions of this paper are as follows:

(1) We propose OAA-SGD, a distributed adaptive gradient descent algorithm based on an analysis of the root causes of oscillation. To verify the effectiveness of OAA-SGD, we use Matlab to analyze its classification results and convergence behavior on a logistic regression benchmark on a single machine. Experiments show that OAA-SGD achieves better classification results and a faster convergence rate than existing methods.

(2) We propose LS-SGD, an iterative optimization algorithm with local search, to remedy the inefficient use of broadcast variables in the MLlib SGD implementation. LS-SGD runs multiple rounds of local iterations on each local data shard within every global iteration (a generic sketch of this scheme is given after the abstract). Experimental results show that LS-SGD converges faster than MLlib SGD on linear regression problems. In addition, the convergence of LS-SGD is proved theoretically.

(3) We propose a distributed adaptive SGD algorithm that builds on OAA-SGD and LS-SGD to address the limitations of the original MLlib SGD implementation. Combining LS-SGD with OAA-SGD yields effective control of the number of local iterations, as well as adaptive adjustment of the momentum term and learning rate in a distributed manner.

(4) We propose OLP-SGD, a parameter-server-based algorithm that addresses the single-point problem in MLlib SGD. Spark-based parameter servers store, share, and update the parameters of the network model in a distributed manner. Experiments on a linear regression dataset show that OLP-SGD achieves a 3 to 6 times speedup over MLlib SGD. Experiments on an image classification problem show that OLP-SGD achieves classification results that are not inferior to existing algorithms. Furthermore, OLP-SGD also exhibits good node scalability.
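To make the local-iteration idea in contribution (2) concrete, the sketch below shows a generic local-SGD scheme on Spark: each partition runs several SGD steps on its own data shard, and the driver averages the per-partition models once per global round, so the model is broadcast and aggregated only once per round instead of once per gradient step. This is a minimal sketch, not the thesis implementation of LS-SGD; the object name LocalSgdSketch, the squared-loss update, the toy dataset, and the hyperparameters (lr, localIters, globalRounds) are all illustrative assumptions.

import org.apache.spark.sql.SparkSession
import scala.util.Random

object LocalSgdSketch extends Serializable {

  // One SGD step for squared loss on a single (features, label) example.
  def sgdStep(w: Array[Double], x: Array[Double], y: Double, lr: Double): Array[Double] = {
    val pred = w.zip(x).map { case (wi, xi) => wi * xi }.sum
    val err  = pred - y
    w.zip(x).map { case (wi, xi) => wi - lr * err * xi }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("local-sgd-sketch").getOrCreate()
    val sc = spark.sparkContext

    // Toy dataset: y = 2*x1 + 3*x2, split into four data shards (partitions).
    val data = sc.parallelize(Seq.fill(1000) {
      val x = Array(Random.nextDouble(), Random.nextDouble())
      (x, 2.0 * x(0) + 3.0 * x(1))
    }, numSlices = 4).cache()

    var w = Array(0.0, 0.0)      // global model
    val lr = 0.1                 // illustrative learning rate
    val localIters = 10          // several local passes per global round
    val globalRounds = 20        // only one aggregation per round

    for (_ <- 1 to globalRounds) {
      val bw = sc.broadcast(w)   // broadcast the current model once per round
      // Each partition refines its own copy of the model on its local shard.
      val models = data.mapPartitions { shard =>
        val examples = shard.toArray
        var local = bw.value.clone()
        for (_ <- 1 to localIters; (x, y) <- examples) {
          local = sgdStep(local, x, y, lr)
        }
        Iterator(local)
      }.collect()
      // Average the per-partition models: one communication step per round.
      w = models.transpose.map(col => col.sum / col.length)
      bw.destroy()
    }

    println(s"learned weights: ${w.mkString(", ")}")   // should approach 2.0, 3.0
    spark.stop()
  }
}

Compared with synchronizing gradients at every step, this layout trades some statistical efficiency for far fewer broadcast and aggregation rounds, which is the communication pattern the abstract attributes to LS-SGD.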
Keywords/Search Tags:Optimization Algorithm, SGD, Deep Learning, Spark MLlib