
Research On Distributed Online Projection-free Optimization Algorithm For Deep Learning

Posted on: 2021-08-20 | Degree: Master | Type: Thesis
Country: China | Candidate: J C Li | Full Text: PDF
GTID: 2518306107969749 | Subject: Computer Science and Technology

Abstract/Summary:
Deep learning has attracted widespread attention and research interest in recent years and has achieved great success in many fields. As research deepens, both the scale of deep learning models and the volume of available training data continue to grow. Training large-scale deep learning models on huge datasets, however, consumes substantial computing resources and training time. With the development of network technology and distributed optimization theory, many large-scale optimization problems that are difficult to solve with centralized algorithms can be solved through the cooperation of multiple machines; the training of deep learning models can therefore be cast as a distributed optimization problem. Designing efficient distributed optimization algorithms that reduce the consumption of computing resources and training time is thus very important for the development of deep learning.

This thesis studies distributed online optimization algorithms for deep learning, aiming to handle constrained optimization problems efficiently and to reduce the consumption of computing resources. Two novel distributed online projection-free optimization algorithms based on the conditional gradient (Frank-Wolfe) technique are proposed: a distributed online randomized block-coordinate projection-free algorithm and a distributed online event-triggered projection-free algorithm. The main contributions are as follows.

To address the heavy per-iteration computation of distributed online projection-free algorithms on high-dimensional data, this thesis combines randomized block-coordinate updates with the Frank-Wolfe technique and proposes a distributed online randomized block-coordinate projection-free algorithm. At each iteration the algorithm randomly selects only a subset of the components of the (sub)gradient vector to update, which reduces the computation per iteration. The convergence of the proposed algorithm is established by detailed theoretical analysis: when the local objective functions are convex, the regret is bounded above by O(T^{1/2}). Simulations verify the convergence of the algorithm under different network topologies and with different numbers of nodes.

To address the communication overhead caused by the frequent communication of the distributed online conditional gradient algorithm, this thesis presents a distributed online event-triggered projection-free algorithm. Each computing device in the network maintains a threshold on the deviation between its current state and its last triggered (broadcast) state, and communication between devices occurs only when this deviation exceeds the threshold. The algorithm thereby effectively controls the number of communications between devices and reduces the communication overhead. Its convergence is proved by rigorous mathematical derivation: when the local objective function is convex, the algorithm attains the regret upper bound O(T^{1/2}).

In summary, to reduce both the computational cost and the communication overhead, this thesis presents two new algorithms, the distributed online randomized block-coordinate projection-free algorithm and the distributed online event-triggered projection-free algorithm, which provide a useful reference for further research on distributed online projection-free optimization.
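The two ideas described above can be illustrated with a minimal single-node sketch. This is not the thesis's exact formulation: the box constraint set [-1, 1]^d, the step size, and all function names are illustrative assumptions. The box is chosen because it is a product set, so its linear minimization oracle (LMO) decomposes coordinate-wise, which is what makes a block-coordinate Frank-Wolfe update well defined.

```python
import numpy as np

def lmo_box(grad, radius=1.0):
    """Linear minimization oracle over the box [-r, r]^d:
    argmin_{|s_i| <= r} <grad, s> = -r * sign(grad).
    No projection is ever computed -- this is the projection-free idea."""
    return -radius * np.sign(grad)

def block_fw_step(x, grad, step, block, radius=1.0):
    """One randomized block-coordinate Frank-Wolfe step: only the sampled
    coordinates in `block` are updated, so only that subset of the
    (sub)gradient is needed; the remaining coordinates are left unchanged."""
    x_new = x.copy()
    s_block = -radius * np.sign(grad[block])  # LMO restricted to the block
    x_new[block] = (1.0 - step) * x[block] + step * s_block
    return x_new

def should_trigger(x, x_last_sent, threshold):
    """Event-triggered rule: a node broadcasts its state to its neighbors
    only when it has drifted from the last transmitted state by more than
    `threshold`, which bounds the number of communications."""
    return np.linalg.norm(x - x_last_sent) > threshold
```

A node would call `block_fw_step` with a freshly sampled coordinate block each round (e.g. `rng.choice(d, size=d // 2, replace=False)`) and consult `should_trigger` before communicating; a diminishing step size such as `step = 1 / (t + 1) ** 0.5` is the usual choice for O(T^{1/2}) regret.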
Keywords/Search Tags:Deep learning, Distributed online optimization, Projection-free, Randomized block-coordinate, Event-triggered