
Distributed Machine Learning With Adaptive Sample Selection

Posted on: 2021-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: H Gao
Full Text: PDF
GTID: 2428330647451042
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, artificial intelligence technologies have been successfully applied in many fields such as computer vision, speech processing, and natural language processing. At the same time, as application scenarios grow more complex, we often need massive training data and large-scale machine learning models to achieve our goals. For such large-scale machine learning tasks it is difficult to train a model on a single machine, so distributed machine learning, in which multiple machines cooperate, has become the mainstream solution. In most machine learning tasks, every epoch of training requires all training samples to participate, so for tasks with many training samples a single epoch takes a long time. In addition, during distributed training the machines inevitably need to communicate to exchange information, which introduces additional communication overhead. Therefore, both computation overhead and communication overhead affect the speed of distributed training. In previous research we proposed the Adaptive Sample Selection (ADASS) algorithm, which reduces the computation cost of the training process. This thesis introduces ADASS into distributed machine learning and designs a distributed machine learning algorithm with adaptive sample selection (ADASS-DML). It also designs algorithms that reduce the communication overhead of ADASS-DML, yielding a solution for distributed machine learning that is efficient in both computation and communication. Specifically, this thesis makes the following contributions (minimal illustrative sketches of the three techniques follow the list):

1. We apply adaptive sample selection to distributed machine learning and design and implement the ADASS-DML algorithm under the two currently common communication frameworks, the Parameter Server framework and the Ring All-Reduce framework. The algorithm adaptively selects a subset of important samples to participate in the next epoch of training according to the real-time training state, thereby speeding up training without sacrificing model accuracy. We verify the effectiveness of ADASS-DML through experiments on real data sets and compare its training speed under the two communication frameworks. We find that even under the more efficient Ring All-Reduce framework, communication overhead remains a bottleneck for distributed training efficiency; the following two communication compression algorithms are designed to address this problem.

2. In the Ring All-Reduce framework, the gradients exchanged between worker nodes are usually 32-bit floating-point numbers. Some existing quantization algorithms for other communication frameworks represent the communicated tensors with fewer bits to reduce communication overhead, but they cannot be applied directly to the Ring All-Reduce framework. This thesis therefore designs and implements Q-ADASS, a quantization algorithm combined with adaptive sample selection under the Ring All-Reduce framework. Experiments on real data sets show that the algorithm can represent the communicated tensors with low-bit values while performing adaptive sampling, without affecting the accuracy of the final model.
3. The communication compression brought by quantization alone cannot completely solve the communication overhead problem of the distributed ADASS algorithm. Based on the characteristics of the Ring All-Reduce framework, this thesis designs and implements RS-ADASS, a random sparsification algorithm combined with adaptive sample selection. When synchronizing the model, the algorithm does not need to communicate the complete gradient; only a small fraction of the gradient's dimensions is communicated. Experiments on real data sets show that the algorithm further reduces communication overhead during training without sacrificing model accuracy.
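To make the sample-selection idea in contribution 1 concrete, here is a minimal Python sketch of one plausible adaptive criterion: keep for the next epoch the samples whose loss improved least between epochs (the "hard" samples). The improvement-based rule and the keep_frac parameter are illustrative assumptions, not the exact ADASS criterion from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_samples(curr_loss, prev_loss, keep_frac=0.5):
    # Improvement of each sample since the last epoch;
    # a small improvement means the sample is still "hard".
    improvement = prev_loss - curr_loss
    k = max(1, int(keep_frac * len(curr_loss)))
    # Keep the k least-improved samples for the next epoch.
    return np.argsort(improvement)[:k]

# Toy usage: per-sample losses from two consecutive epochs.
prev = rng.uniform(0.5, 2.0, size=10)
curr = prev - rng.uniform(0.0, 0.5, size=10)
print(select_samples(curr, prev, keep_frac=0.4))
```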
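For contribution 2, the following sketch shows a generic low-bit uniform quantizer with stochastic rounding, the general kind of scheme gradient-quantization methods rely on. The 4-bit default and the max-magnitude scaling are assumptions for illustration; the actual Q-ADASS design, adapted to Ring All-Reduce, may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(grad, bits=4):
    # Map |grad| onto integer levels {0, ..., 2^bits - 1},
    # scaled by the tensor's maximum magnitude.
    levels = 2 ** bits - 1
    scale = float(np.max(np.abs(grad))) or 1.0  # avoid division by zero
    normalized = np.abs(grad) / scale * levels
    floor = np.floor(normalized)
    # Stochastic rounding keeps the quantizer unbiased in expectation.
    codes = floor + (rng.random(grad.shape) < (normalized - floor))
    return np.sign(grad) * codes, scale, levels

def dequantize(codes, scale, levels):
    return codes / levels * scale

g = rng.standard_normal(8).astype(np.float32)
codes, scale, levels = quantize(g)
print(g)
print(dequantize(codes, scale, levels))  # low-bit approximation of g
```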
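For contribution 3, this sketch illustrates generic random sparsification: a worker communicates only a random subset of gradient coordinates, rescaled by 1/keep_frac so the sparse estimate remains unbiased in expectation. The function names and the rescaling step are illustrative assumptions, not necessarily the exact RS-ADASS construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_sparsify(grad, keep_frac=0.25):
    # Choose a random subset of coordinates to communicate.
    n = grad.size
    k = max(1, int(keep_frac * n))
    idx = rng.choice(n, size=k, replace=False)
    # Rescale so the sparse gradient is unbiased in expectation.
    values = grad.ravel()[idx] / keep_frac
    return idx, values

def densify(idx, values, shape):
    # The receiver reassembles a full-size, mostly zero gradient.
    out = np.zeros(int(np.prod(shape)))
    out[idx] = values
    return out.reshape(shape)

g = rng.standard_normal((4, 4))
idx, vals = random_sparsify(g)
print(densify(idx, vals, g.shape))
```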
Keywords: Machine Learning, Distributed Computing, Sample Selection, Communication Compression