
Research On GPU Storage Optimization For Deep Learning Applications

Posted on: 2020-01-06 | Degree: Master | Type: Thesis
Country: China | Candidate: X Ge | Full Text: PDF
GTID: 2428330590458376 | Subject: Computer software and theory
Abstract/Summary:
The massive computational parallelism of GPUs makes them indispensable for training deep neural networks (DNNs). As DNNs grow deeper and wider to improve their accuracy, the memory footprint of training grows accordingly, and the limited capacity of GPU DRAM becomes the primary barrier to training deeper and wider networks on a GPU. To mitigate this capacity issue, offload-prefetch and recomputation have been proposed to reduce the memory footprint of training on a single GPU: offloading trades memory consumption for data transfers over the slow PCIe bus, while recomputation trades it for additional computation, so that only part of each layer's output needs to be retained in GPU memory during forward propagation. However, these methods lack an in-depth analysis of the data dependencies among layers during backward computation, and they provide no support for bottleneck layers.

To address this problem, this thesis proposes MEDL, a deep learning system that fully exploits recomputation to save GPU memory and provides memory optimization methods for both non-linear and linear networks. In non-linear networks, the memory overhead of utility layers is reduced through spatial reuse and recomputation, eliminating the need to keep a utility layer's forward output as a checkpoint; the GPU memory occupied by that output can be released directly, greatly reducing the memory required when training neural networks with recomputation. In linear networks, a fine-grained double-buffering technique reduces the memory overhead of bottleneck layers, allowing wider neural networks to be trained within limited GPU memory.

Experimental results show that MEDL can train deeper and wider neural networks within limited GPU memory. Under the same experimental environment, MEDL reduces memory consumption by 27.5% on average compared with an existing deep learning system, while improving performance by 10.9%. When the training batch size is increased to the point where the existing system can no longer run, MEDL still works normally.
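For reference, the sketch below illustrates the recomputation idea the abstract builds on, using layer-wise gradient checkpointing with PyTorch's torch.utils.checkpoint. It is a minimal illustration of the general trade-off (recompute activations during the backward pass instead of storing them), not MEDL's own implementation; the DeepNet class, the layer widths, and the batch size are illustrative assumptions.

    import torch
    import torch.nn as nn
    from torch.utils.checkpoint import checkpoint

    class DeepNet(nn.Module):
        """A plain chain of wide fully connected blocks (hypothetical example)."""
        def __init__(self, width=4096, depth=8):
            super().__init__()
            self.blocks = nn.ModuleList(
                nn.Sequential(nn.Linear(width, width), nn.ReLU())
                for _ in range(depth)
            )

        def forward(self, x):
            for block in self.blocks:
                # checkpoint() drops the block's intermediate activations after
                # the forward pass and recomputes them during backward, trading
                # extra computation for a smaller GPU memory footprint.
                x = checkpoint(block, x)
            return x

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = DeepNet().to(device)
    inp = torch.randn(64, 4096, device=device, requires_grad=True)
    loss = model(inp).sum()
    loss.backward()  # activations are recomputed block by block here

The offload-prefetch side of the trade-off would, in the same spirit, overlap host-device transfers with computation (for example by double buffering across CUDA streams), but that is not shown in this sketch.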
Keywords/Search Tags: Deep neural networks, GPU memory management, Recomputation, Liveness analysis