
Research On Data Extraction And Communication Optimization For Distributed Deep Learning

Posted on: 2020-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhu
GTID: 2428330590458356
Subject: Cyberspace security

Abstract/Summary:
With the widening and deepening of neural network models in deep learning and the growth of dataset sizes, traditional single-machine training can no longer meet practical needs. To achieve efficient training, distributed deep learning has emerged. At the same time, the high communication overhead between physical machines in distributed deep learning brings new challenges. To solve the problem of low training efficiency caused by this overhead, researchers have proposed a variety of methods to improve communication efficiency.

Based on the characteristics of gradient parameters, this thesis proposes an optimized transmission method based on quantization and compression. Following the observation that gradients with large absolute values contribute most to convergence, transmitting only those gradients can greatly improve the communication efficiency of distributed deep learning. The proposed Fixed-Exponential Compression (FEC) algorithm mainly uses three optimization strategies. First, the gradient parameters are filtered according to a specified exponential threshold: only gradients whose absolute values are greater than or equal to the threshold are transmitted, which increases the sparsity of the gradients. Second, each of the filtered 32-bit floating-point gradient values is compressed to 5 bits using a 5-bit quantization algorithm. To make the quantization more efficient, multi-threaded parallel computing is used to implement the 5-bit quantization, which greatly improves its throughput. At the same time, the 5 bits of the quantized output are split into a 1-bit part and a 4-bit part to improve space utilization. Third, because the first step increases the sparsity of the gradient parameters, a zero-run compression algorithm is applied to consecutive zero values, further reducing the amount of data transmitted. A minimal sketch of these three steps appears after this abstract.

The performance of FEC was tested on a cluster using the MXNet system, training various neural networks on four datasets: MNIST, CIFAR10, CIFAR100, and ImageNet1K. The experimental results are compared with the baseline and the 2Bit compression method in MXNet. In terms of gradient compression ratio, the FEC method achieves a compression ratio of 16.3 to 23.3 times, higher than that of 2Bit. In terms of communication efficiency, the FEC method accelerates training by 1.18 to 3.34 times over the baseline. At the same time, the convergence of FEC basically matches that of the baseline, and compared to the 2Bit method, its convergence stability and accuracy are better.
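The abstract does not give the exact FEC encoding, so the following is only a minimal NumPy sketch of the three steps it describes: exponential-threshold filtering, 5-bit quantization split into a 1-bit sign and a 4-bit value, and run-length compression of consecutive zeros. The default threshold, the interpretation of the 1-bit/4-bit split as sign plus clipped exponent, and the token format are assumptions made for illustration, not the thesis's actual implementation.

```python
# Hedged sketch of the FEC pipeline described in the abstract.
# Assumptions (not from the thesis): the threshold is a power of two,
# the 4-bit part stores a clipped exponent offset, and the output is a
# list of ("zeros", run_length) / ("code", 5_bit_value) tokens.
import numpy as np

def fec_compress(grad, exp_threshold=-8):
    """Sparsify, 5-bit-quantize, and run-length-encode a gradient tensor."""
    g = grad.astype(np.float32).ravel()

    # Step 1: exponential threshold filter -- zero out gradients whose
    # magnitude is below 2**exp_threshold, increasing sparsity.
    mask = np.abs(g) >= 2.0 ** exp_threshold
    sparse = np.where(mask, g, np.float32(0.0))
    nz = sparse != 0

    # Step 2: 5-bit quantization of the surviving values:
    # 1 bit for the sign, 4 bits for a clipped exponent offset.
    signs = (sparse < 0).astype(np.uint8)
    exps = np.zeros_like(signs)
    exps[nz] = np.clip(
        np.floor(np.log2(np.abs(sparse[nz]))) - exp_threshold, 0, 15
    ).astype(np.uint8)
    codes = (signs << 4) | exps  # packed 5-bit code per element

    # Step 3: run-length encode consecutive zeros so a long zero run
    # costs one token instead of one code per element.
    tokens, i = [], 0
    while i < len(codes):
        if not nz[i]:
            j = i
            while j < len(codes) and not nz[j]:
                j += 1
            tokens.append(("zeros", j - i))
            i = j
        else:
            tokens.append(("code", int(codes[i])))
            i += 1
    return tokens

def fec_decompress(tokens, shape, exp_threshold=-8):
    """Invert the sketch above (reconstruction is lossy by design)."""
    vals = []
    for kind, v in tokens:
        if kind == "zeros":
            vals.extend([0.0] * v)
        else:
            sign = -1.0 if (v >> 4) & 1 else 1.0
            vals.append(sign * 2.0 ** ((v & 15) + exp_threshold))
    return np.array(vals, dtype=np.float32).reshape(shape)
```

In the actual system the 5-bit codes would be bit-packed and the pipeline would run inside MXNet's parameter-server communication path with multi-threaded quantization; the sketch only shows the logical transformation applied to each gradient tensor.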
Keywords/Search Tags:Distributed deep learning, Data compression, Distributed communication, Parameter server