
Model Training Performance Analysis Of Typical Deep Learning Frameworks In The Single GPU Environment

Posted on: 2021-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: H L Dai
Full Text: PDF
GTID: 2518306104494634
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of deep learning, a large number of frameworks have emerged to help practitioners write and train models efficiently. By programming paradigm, these frameworks fall into two categories: declarative and imperative. The most popular representatives of the two categories are TensorFlow and PyTorch. TensorFlow represents the computation process as a static computation graph, while PyTorch uses a dynamic computation graph: the former can optimize the graph before execution, while the latter handles variable-length input more naturally. Because of these different graph models, TensorFlow and PyTorch differ substantially in framework design, task scheduling, and computation-graph execution, which makes it difficult to compare and analyze the performance of the corresponding parts of the two frameworks.

To analyze in depth the performance difference between TensorFlow and PyTorch when training deep neural network (DNN) models in a single-GPU environment, and to identify the key factors affecting performance, this thesis defines a performance model for single-GPU DNN training and evaluates both frameworks experimentally against it. The performance model follows the standard DNN training pipeline and accounts for factors such as I/O, memory copying, CPU processing, GPU processing, and computation-graph optimization, so that it reflects the performance of the entire training process. The experiments benchmark the training performance of the two frameworks on 7 popular DNN models covering CNN, RNN, and Transformer network structures, followed by qualitative and quantitative analysis and comparison. The analysis shows that, in a single-GPU environment, factors such as task scheduling, data loading, and memory copying each affect overall performance by less than 3%, while the implementation of the key layers of the deep learning model is critical to training speed. For most models, computation-graph optimization improves training performance by no more than 2.5%, i.e., it has little effect on performance. These results can provide technical guidance for deep learning practitioners in framework selection and performance optimization.
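The static-versus-dynamic distinction described above can be sketched with a toy expression graph. This is a simplified illustration, not the real TensorFlow or PyTorch API: a declarative framework builds the whole graph first and can optimize it (here, by constant folding) before execution, whereas an imperative framework would evaluate each operation immediately.

```python
# Toy illustration of declarative (static-graph) style: build, optimize, run.
# None of this is the actual TensorFlow/PyTorch API; it is a minimal sketch.

class Node:
    def __init__(self, op, args):
        self.op, self.args = op, args

def const(v):
    return Node("const", [v])

def optimize(node):
    """Static-graph advantage: fold constant sub-graphs before any run."""
    if node.op in ("const", "input"):
        return node
    args = [optimize(a) for a in node.args]
    if all(a.op == "const" for a in args):
        v0, v1 = args[0].args[0], args[1].args[0]
        return const(v0 + v1 if node.op == "add" else v0 * v1)
    return Node(node.op, args)

def run(node, env):
    """Execute the graph; an eager framework would do this immediately,
    node by node, with no separate optimization pass."""
    if node.op == "const":
        return node.args[0]
    if node.op == "input":
        return env[node.args[0]]
    a, b = (run(x, env) for x in node.args)
    return a + b if node.op == "add" else a * b

# Declarative style: build 2*3 + x once, optimize, then run with inputs.
graph = Node("add", [Node("mul", [const(2.0), const(3.0)]),
                     Node("input", ["x"])])
graph = optimize(graph)            # folds 2*3 -> 6.0 before execution
print(run(graph, {"x": 1.0}))      # prints 7.0
```

The trade-off the abstract describes follows from this structure: the optimization pass needs the whole graph up front, which is exactly what a dynamic (eager) framework gives up in exchange for handling variable-length input naturally.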
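The per-phase performance model described above can be sketched as a timing decomposition of one training step. This is a minimal illustration with stand-in phase functions: the phase names mirror the factors listed in the abstract (I/O, memory copy, CPU, GPU), but the functions and durations are hypothetical placeholders, not measurements from either framework.

```python
import time

def timed(fn):
    """Return the wall-clock duration of one call to fn."""
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0

# Stand-ins for the phases of one training step (durations are arbitrary).
def load_batch():      time.sleep(0.001)   # data loading (I/O)
def copy_to_device():  time.sleep(0.0005)  # host-to-GPU memory copy
def cpu_preprocess():  time.sleep(0.001)   # CPU-side processing
def gpu_compute():     time.sleep(0.004)   # forward/backward on the GPU

def profile_step():
    """Decompose one step into per-phase times; total is their sum."""
    phases = {}
    for name, fn in [("io", load_batch), ("copy", copy_to_device),
                     ("cpu", cpu_preprocess), ("gpu", gpu_compute)]:
        phases[name] = timed(fn)
    phases["total"] = sum(phases.values())
    return phases

p = profile_step()
# Share of the step spent outside GPU compute -- the kind of quantity the
# thesis reports (e.g. scheduling/loading/copying each under 3%).
overhead_share = 1.0 - p["gpu"] / p["total"]
```

Measuring each phase separately, rather than only the end-to-end step time, is what lets the analysis attribute the overall difference to specific factors such as key-layer implementation rather than data loading or memory copying.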
Keywords/Search Tags:Deep Learning, Comparison, TensorFlow, PyTorch