
Communication Optimization In Decentralized Distributed Machine Learning

Posted on: 2020-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: G D Zhang
Full Text: PDF
GTID: 2428330575458252
Subject: Computer technology
Abstract/Summary:
With the rapid growth of data, big data-based machine learning has been widely used in recent years. When a data set is too large to be handled by a single machine, we need to design distributed learning methods that run on a cluster of machines. Hence, distributed machine learning has attracted much attention, and reducing the communication traffic of distributed learning methods has become a hot research topic.

In a distributed machine learning task, multiple machines work together, and the communication cost between machines is high, so it is necessary to improve the communication efficiency of distributed learning methods. Existing distributed learning frameworks fall into two main categories: the centralized distributed framework and the decentralized distributed framework. A centralized framework consists of one central node and several workers, and the communication cost on the central node is high. A decentralized framework has no central node, so the communication burden is balanced among all nodes. However, even with a balanced burden, the communication cost is still high when the model is large, and synchronous decentralized methods additionally incur a high synchronization cost. This paper focuses on the decentralized distributed framework and improves the efficiency of decentralized distributed machine learning by reducing the communication cost. It proposes three methods, which optimize communication from three aspects: data partitioning, asynchronous communication, and gradient quantization. The main contributions of this paper are as follows:

1. Most existing distributed learning methods are instance-distributed: they partition the training data by instances. This paper proposes a new distributed machine learning method, feature-distributed stochastic variance reduced gradient (FD-SVRG), for high-dimensional linear classification tasks. FD-SVRG partitions the data by features (a minimal feature-partitioning sketch follows this abstract). Experimental results on real data demonstrate that FD-SVRG can outperform other state-of-the-art distributed methods in terms of both communication cost and wall-clock time when the data dimensionality is larger than the number of data instances.

2. RingAllreduce is a common synchronous decentralized distributed method. This paper proposes an asynchronous variant called A-RingAllreduce (the underlying ring communication pattern is sketched below). Experimental results on real data demonstrate that A-RingAllreduce can effectively reduce the synchronization cost when the nodes in the cluster run at different speeds, while achieving the same accuracy as synchronous methods.

3. RingAllreduce needs to communicate floating-point parameter or gradient vectors among machines. This paper proposes a quantized variant called Q-RingAllreduce, which quantizes the gradients and represents them with fewer bits (a simple quantization sketch follows this abstract). Experimental results on real data demonstrate that Q-RingAllreduce has a lower communication cost than RingAllreduce, and that with 4-bit or 8-bit gradients it achieves the same accuracy as RingAllreduce.
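For the linear models targeted by FD-SVRG, the prediction w^T x decomposes across feature blocks, so a worker holding a block of features only needs to exchange partial inner products (one scalar per instance) rather than full d-dimensional vectors. The following is a minimal NumPy sketch of this general feature-partitioning idea; the array sizes, the block split, and the variable names are illustrative assumptions, not the thesis's actual FD-SVRG algorithm.

```python
import numpy as np

# Hypothetical sizes: n instances, d features (d >> n), k feature-partitioned workers.
n, d, k = 100, 1000, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))
w = rng.standard_normal(d)

# Split the feature indices (columns of X, entries of w) across the k workers.
feature_blocks = np.array_split(np.arange(d), k)

# Each worker computes its partial margins X[:, block] @ w[block] locally; only
# these n-dimensional partial sums are communicated, never d-dimensional vectors.
partials = [X[:, block] @ w[block] for block in feature_blocks]
full_margin = np.sum(partials, axis=0)

# The partial sums recover exactly the margins of the full linear model.
assert np.allclose(full_margin, X @ w)
```

When d is much larger than n, exchanging n partial margins per worker is far cheaper than exchanging d-dimensional parameter or gradient vectors, which is the setting in which the abstract reports FD-SVRG winning on communication cost.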
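RingAllreduce, which A-RingAllreduce builds on, splits each node's gradient into as many chunks as there are nodes and circulates the chunks around a ring: a scatter-reduce phase sums each chunk as it travels, and an all-gather phase distributes the finished sums. Below is a minimal synchronous simulation of this standard pattern in NumPy; it illustrates only the baseline ring algorithm, not the thesis's asynchronous A-RingAllreduce, and the function and variable names are assumptions for illustration.

```python
import numpy as np

def ring_allreduce(chunks):
    """Simulate ring allreduce.

    chunks[i][c] is node i's c-th gradient chunk; there are n nodes and n chunks.
    After the call, every node holds the element-wise sum over all nodes.
    """
    n = len(chunks)
    # Scatter-reduce: after n-1 steps, node i holds the full sum of chunk (i+1) % n.
    for step in range(n - 1):
        for sender in range(n):
            c = (sender - step) % n
            receiver = (sender + 1) % n
            chunks[receiver][c] = chunks[receiver][c] + chunks[sender][c]
    # All-gather: circulate the reduced chunks so every node ends up with all of them.
    for step in range(n - 1):
        for sender in range(n):
            c = (sender + 1 - step) % n
            receiver = (sender + 1) % n
            chunks[receiver][c] = chunks[sender][c].copy()
    return chunks

# Check against a direct sum: 4 simulated nodes, 12-dimensional gradients.
n, d = 4, 12
rng = np.random.default_rng(0)
grads = [rng.standard_normal(d) for _ in range(n)]
chunks = [list(np.array_split(g.copy(), n)) for g in grads]
ring_allreduce(chunks)
expected = np.array_split(sum(grads), n)
assert all(np.allclose(chunks[i][c], expected[c]) for i in range(n) for c in range(n))
```

In the synchronous version every node waits for its neighbor at each step, which is the synchronization cost the abstract says A-RingAllreduce removes by letting slow and fast nodes proceed asynchronously.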
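Gradient quantization of the kind Q-RingAllreduce relies on maps each float gradient entry to a small integer code plus a scale, so far fewer bits cross the network, and the receiver dequantizes before accumulating. The sketch below shows simple uniform min-max quantization as one common way to do this; the exact quantization scheme, the bit packing, and the function names are assumptions for illustration, not the thesis's Q-RingAllreduce.

```python
import numpy as np

def quantize(grad, num_bits=8):
    """Uniformly quantize a float gradient to integer codes in [0, 2**num_bits - 1].

    Returns the codes plus the (min, scale) needed to dequantize. For num_bits < 8
    the codes still occupy one byte each here; a real system would pack them tighter.
    """
    levels = 2 ** num_bits - 1
    g_min, g_max = float(grad.min()), float(grad.max())
    scale = (g_max - g_min) / levels if g_max > g_min else 1.0
    codes = np.round((grad - g_min) / scale).astype(np.uint8)
    return codes, g_min, scale

def dequantize(codes, g_min, scale):
    """Map integer codes back to approximate float gradient values."""
    return codes.astype(np.float32) * scale + g_min

# 8-bit round trip: the reconstruction error is bounded by half a quantization step.
rng = np.random.default_rng(0)
g = rng.standard_normal(1000).astype(np.float32)
codes, g_min, scale = quantize(g, num_bits=8)
g_hat = dequantize(codes, g_min, scale)
assert np.max(np.abs(g - g_hat)) <= scale / 2 + 1e-6
```

With 8-bit codes each gradient entry needs a quarter of the bytes of a 32-bit float (an eighth with packed 4-bit codes), which matches the direction of the communication savings the abstract reports for Q-RingAllreduce.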
Keywords/Search Tags:Distributed, Machine Learning, Decentralized, Communication Opti-mization