Deep neural networks currently play an important role in image processing, speech recognition, and natural language processing. However, the large volume of training data makes model training slow. In the past, learning was accelerated by increasing the number of machines; today, with the growing memory capacity and computational power of GPUs (graphics processing units), training is usually performed on GPUs. Model size, however, is constrained by the limited memory of a single GPU: larger models often cannot be stored on one GPU, so neural networks with too many parameters cannot be trained there.

To address the low efficiency of deep neural network training, this paper proposes training deep neural network models in parallel on multiple GPUs. Parallel training is optimized in three respects. First, the model is divided into two parts stored on two GPUs respectively, so that the two parts can be computed in parallel. Second, the parallel training scheme is matched to the network structure: the convolutional layers use data parallelism, while the fully connected layers use model parallelism. At the same time, memory access to the training data is optimized by adding a data conversion layer to the parallel model structure to integrate or exchange data on the GPUs. Third, because the training datasets are large, a parallel mini-batch training method is used to optimize data processing. By combining data parallelism with model parallelism and relying on the strong collaborative computing capability of multiple GPUs, the parallel training of deep neural network models is accelerated.

Experiments were conducted on the MNIST, CIFAR10, and CAR datasets under the Linux operating system and the CUDA programming environment. The results show that, with comparable accuracy, the multi-GPU parallel training method improves training efficiency by 20-30% compared with Caffe and yields a smaller training loss. Finally, the parallel training method has been successfully applied to an automatic vehicle recognition system.
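To make the hybrid scheme concrete, the following is a minimal CUDA sketch of the model-parallel fully connected layer: each of two GPUs stores half of the layer's weight matrix and computes half of the output neurons, with the full input activation replicated on both devices, as it would be after the data conversion layer. All names and dimensions here (fc_forward, IN_DIM, OUT_DIM) are hypothetical illustrations, not the paper's actual implementation, which would presumably use optimized libraries such as cuBLAS/cuDNN and overlap transfers with computation.

// fc_model_parallel.cu -- illustrative sketch only, not the authors' code.
// Build: nvcc fc_model_parallel.cu -o fc_model_parallel (requires 2 GPUs)
#include <cuda_runtime.h>
#include <stdio.h>

#define IN_DIM  1024   // hypothetical input width of the FC layer
#define OUT_DIM 512    // hypothetical output width, split across 2 GPUs

// Naive kernel: each thread computes one output neuron
// y[o] = sum_i W[o][i] * x[i]
__global__ void fc_forward(const float *W, const float *x, float *y,
                           int in_dim, int out_dim) {
    int o = blockIdx.x * blockDim.x + threadIdx.x;
    if (o < out_dim) {
        float acc = 0.0f;
        for (int i = 0; i < in_dim; ++i)
            acc += W[o * in_dim + i] * x[i];
        y[o] = acc;
    }
}

int main(void) {
    const int half = OUT_DIM / 2;      // each GPU owns half the neurons
    float *W[2], *x[2], *y[2];
    float h_x[IN_DIM], h_y[OUT_DIM];
    for (int i = 0; i < IN_DIM; ++i) h_x[i] = 1.0f / IN_DIM;

    // Each GPU stores only its half of the weight matrix (model
    // parallelism); the full input activation is replicated on both
    // devices. Weights are zero-initialized for brevity; a real trainer
    // would load learned parameters.
    for (int d = 0; d < 2; ++d) {
        cudaSetDevice(d);
        cudaMalloc(&W[d], half * IN_DIM * sizeof(float));
        cudaMalloc(&x[d], IN_DIM * sizeof(float));
        cudaMalloc(&y[d], half * sizeof(float));
        cudaMemset(W[d], 0, half * IN_DIM * sizeof(float));
        cudaMemcpy(x[d], h_x, IN_DIM * sizeof(float),
                   cudaMemcpyHostToDevice);
    }

    // Launch both halves; kernel launches are asynchronous, so the two
    // GPUs compute their output partitions concurrently.
    for (int d = 0; d < 2; ++d) {
        cudaSetDevice(d);
        fc_forward<<<(half + 255) / 256, 256>>>(W[d], x[d], y[d],
                                                IN_DIM, half);
    }

    // Gather the two partial output vectors back on the host.
    for (int d = 0; d < 2; ++d) {
        cudaSetDevice(d);
        cudaMemcpy(h_y + d * half, y[d], half * sizeof(float),
                   cudaMemcpyDeviceToHost);
    }
    printf("y[0] = %f, y[%d] = %f\n", h_y[0], OUT_DIM - 1,
           h_y[OUT_DIM - 1]);

    for (int d = 0; d < 2; ++d) {
        cudaSetDevice(d);
        cudaFree(W[d]); cudaFree(x[d]); cudaFree(y[d]);
    }
    return 0;
}

A note on the design choice this sketch assumes: splitting the weight matrix by output neurons means the forward pass needs no cross-GPU reduction; only the two partial output vectors must be gathered and redistributed, which is presumably the kind of data integration and exchange the paper's data conversion layer performs on the GPU.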