
Research On Memory Reuse And Optimization Methods For Deep Learning System

Posted on: 2020-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ma
Full Text: PDF
GTID: 2428330590458365
Subject: Computer system architecture
Abstract/Summary:
In deep learning, GPUs are commonly used to accelerate the training of deep neural networks. However, the limited physical memory of a GPU makes it difficult to train large-scale deep neural network models. Existing memory optimization methods, including recomputation and CPU-GPU transfer, cannot achieve ideal training performance when a single optimization is applied uniformly to all layers of a neural network, because they ignore an important characteristic: the data transfer cost and the computation cost differ from layer to layer. To overcome these shortcomings, a layer-based memory reuse and optimization method, Layup, is proposed, which consists of two strategies. First, exploiting asynchronous concurrent execution, the layers of the neural network are divided into two categories, compute-sensitive and transfer-sensitive, by analyzing the transfer and recomputation costs of each layer. Different types of layers then use different optimization strategies: the feature maps of compute-sensitive layers are optimized with CPU-GPU transfer, while the feature maps of transfer-sensitive layers are optimized with recomputation. At the same time, a pipelined parallel scheme overlaps data transfer with computation to further reduce the training overhead. Second, by analyzing memory usage during training, a memory reuse strategy for multiple kinds of intermediate data is proposed: the memory of gradient maps is reused in a sliding-window fashion, and, based on the layer-by-layer computation pattern of neural networks, the convolution workspace and cuDNN handle data are reused across layers, further reducing the memory footprint of deep neural networks. The method is implemented on the Caffe framework and evaluated on two different GPUs. Experiments show that Layup significantly reduces the memory usage of deep neural networks while keeping the performance overhead low: memory consumption during training is reduced by up to 92%, while the average performance overhead on the tested networks is only 12%. Layup can even train a 2,500-layer ResNet within 12 GB of GPU memory (batch size = 16), a 30% improvement over SuperNeurons, further increasing the scale of extra-deep network models that can be trained on a single GPU.
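To make the layer-wise policy selection concrete, the following is a minimal C++ sketch of the core idea: compare each layer's measured feature-map transfer cost against its recomputation cost and assign the cheaper strategy. This is not the thesis's actual Caffe implementation; the names LayerCost, MemoryPolicy, and choose_policies are hypothetical, and the per-layer costs in main() are made up for illustration.

```cpp
// Sketch of Layup-style per-layer policy selection (hypothetical names).
// Compute-sensitive layers (recomputation is expensive) have their feature
// maps offloaded via CPU-GPU transfer; transfer-sensitive layers (transfer
// is expensive) are recomputed instead.
#include <iostream>
#include <string>
#include <vector>

enum class MemoryPolicy { TransferToCPU, Recompute };

struct LayerCost {
    std::string name;
    double transfer_ms;   // measured CPU-GPU transfer time of the feature map
    double recompute_ms;  // measured forward recomputation time of the layer
};

std::vector<MemoryPolicy> choose_policies(const std::vector<LayerCost>& layers) {
    std::vector<MemoryPolicy> policies;
    policies.reserve(layers.size());
    for (const auto& l : layers) {
        policies.push_back(l.recompute_ms > l.transfer_ms
                               ? MemoryPolicy::TransferToCPU
                               : MemoryPolicy::Recompute);
    }
    return policies;
}

int main() {
    // Illustrative (made-up) per-layer costs.
    std::vector<LayerCost> layers = {
        {"conv1", 1.8, 4.2},   // convolution: recomputation costly -> offload
        {"relu1", 1.6, 0.1},   // ReLU: recomputation cheap -> recompute
        {"pool1", 0.9, 0.3},
    };
    auto policies = choose_policies(layers);
    for (size_t i = 0; i < layers.size(); ++i) {
        std::cout << layers[i].name << ": "
                  << (policies[i] == MemoryPolicy::TransferToCPU
                          ? "CPU-GPU transfer" : "recompute")
                  << "\n";
    }
    return 0;
}
```

The sketch omits the pipelined overlap of transfers with computation and the sliding-window reuse of gradient-map memory described in the abstract; it only illustrates how the compute-sensitive / transfer-sensitive split might drive the choice of optimization for each layer.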
Keywords/Search Tags: Deep learning, Deep neural network, GPU, Memory management