
A Study of Communication Optimization Algorithms on Deep Neural Networks

Posted on: 2020-01-08    Degree: Master    Type: Thesis
Country: China    Candidate: Z Zhang    Full Text: PDF
GTID: 2428330599461772    Subject: Computer software and theory
Abstract/Summary:
Deep neural networks have been successfully applied in the fields of image processing, machine translation, and speech recognition. In the face of ever-increasing data volumes, training deep neural network models in a distributed fashion is an effective solution. However, several problems remain in distributed training. First, in terms of system architecture, the current mainstream approach is the parameter server architecture. It does not assign computing nodes according to the characteristics of the different layers of a deep neural network, which results in excessive communication overhead. Second, in terms of communication data compression, the current mainstream method is gradient sparsification. Its communication complexity is too high, and the gradient values that survive sparsification are still large, which adds to the communication overhead.

To address these shortcomings of distributed deep neural network training, the Hourglass architecture and the Sparse Gradient Compression algorithm are proposed, reducing the communication overhead from two aspects: system architecture and communication data compression. Together they speed up the training process while keeping the accuracy loss within 1%.

The Hourglass architecture allocates the computation of fully connected (FC) layers and convolutional (CONV) layers to different workers. The majority of workers in the cluster compute the CONV layers, while the remaining workers compute the FC layers. The architecture makes full use of the machines' computing power: the parameters and gradients of the FC layers are exchanged only among the FC workers rather than across the whole cluster, which reduces the overall traffic.

Sparse Gradient Compression consists of hierarchical gradient sparsification, quantization, and communication delay. Specifically: (1) the hierarchical gradient sparsification algorithm lowers the communication complexity with respect to n, the number of computing nodes, and m, the transmission time required for each byte-sized message, addressing the high communication complexity of existing work; (2) the gradient quantization algorithm quantizes the sparse gradient values to 2-bit representations; (3) the communication delay algorithm lets each computing node accumulate more parameter updates by performing multiple iterations of stochastic gradient descent between communications (see the code sketch following this abstract).

Experimental results for image classification, language modeling, and speech recognition on the CIFAR-10, ImageNet, PTB, and LibriSpeech datasets demonstrate the effectiveness of the Hourglass architecture and Sparse Gradient Compression. Across these datasets and deep neural network models, and compared with state-of-the-art results for the different tasks, the Hourglass architecture and Sparse Gradient Compression improve training speed by about 2 to 15 times and achieve communication data compression ratios of about 2 to 8 times, while keeping the accuracy loss within 1%.
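The abstract describes Sparse Gradient Compression as the combination of gradient sparsification, 2-bit quantization of the surviving values, and delayed communication via extra local SGD iterations. The Python/NumPy sketch below illustrates that pipeline under stated assumptions: the top-k selection ratio, the 2-bit codebook {0, +scale, -scale}, the 0.5·scale threshold, and all function names are illustrative choices of this summary, not the thesis's actual implementation; the Hourglass worker grouping is not shown.

```python
import numpy as np


def sparsify_topk(grad, ratio=0.01):
    """Top-k gradient sparsification: keep only the largest-magnitude entries.

    Returns the flat indices and values of the retained entries. Dropped
    entries would typically be accumulated locally and sent in a later
    round (a common companion technique, assumed here).
    """
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]


def quantize_2bit(values):
    """Quantize sparse gradient values to 2-bit codes.

    Illustrative scheme: one shared scale (the mean magnitude) plus a
    per-value code in {0: zero, 1: +scale, 2: -scale}. The thesis's exact
    codebook is not specified in the abstract.
    """
    scale = float(np.mean(np.abs(values))) if values.size else 0.0
    codes = np.zeros(values.shape, dtype=np.uint8)
    codes[values > 0.5 * scale] = 1
    codes[values < -0.5 * scale] = 2
    return codes, scale


def dequantize_2bit(codes, scale):
    """Reconstruct approximate gradient values from the 2-bit codes."""
    lookup = np.array([0.0, scale, -scale])
    return lookup[codes]


def delayed_local_updates(params, grad_fn, lr=0.01, local_steps=4):
    """Communication delay: run several local SGD iterations before any
    gradient exchange, so each node communicates less frequently."""
    for _ in range(local_steps):
        params = params - lr * grad_fn(params)
    return params
```

In this sketch, a worker would run delayed_local_updates for a few steps, call sparsify_topk on its accumulated gradient, quantize the kept values, and transmit only (idx, codes, scale); receiving workers reconstruct an approximate sparse update with dequantize_2bit and apply it to their model replicas.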
Keywords/Search Tags:Distributed computation, Deep neural networks, Communication optimization, Gradient compression, Communication delay