
Research on an Efficient Distributed Parallel Algorithm for the Deep Learning Framework TensorFlow

Posted on: 2020-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: M J He
Full Text: PDF
GTID: 2428330596975457
Subject: Software engineering
Abstract/Summary:
With the rapid development of the information age and the ever-growing volume of big data, analyzing and exploiting big data on cloud computing platforms has become a powerful competitive tool for enterprises. One of the most effective ways to exploit big data resources is deep learning, a branch of artificial intelligence, and a good deep learning framework is indispensable for this work. Such a framework must be highly flexible, highly portable, and able to support multiple languages to meet changing application needs. Google's open-source TensorFlow is such a framework: although a young open-source project, it is widely respected for its strengths in deep learning.

However, as deep learning is applied to ever more complex problems, models grow larger and larger, and TensorFlow's shortcomings become apparent. Iteratively training a complex deep learning model often takes hours, days, or even weeks, an unacceptable cost in a rapidly developing information age. Although TensorFlow supports distributed iterative training, which alleviates the long training times, it still falls short of requirements and leaves room for optimization. The main goal of this thesis is to optimize the distributed training algorithms, improve the utilization of computing devices, and reduce model training time.

To achieve these goals, this thesis first examines TensorFlow's underlying implementation, analyzing its system architecture, dataflow graph, session management, distributed execution, and data input. It then designs and implements optimization algorithms for TensorFlow's two distributed modes, data parallelism and model parallelism. For data parallelism, the original linear execution mode is replaced with a pipelined execution mode; for model parallelism, the original random model-partitioning scheme is replaced with a novel greedy partitioning algorithm.

Finally, the effectiveness of the two proposed optimization algorithms is verified through several comparative experiments. The data parallelism optimization increases the utilization of computing devices by 28% on average, and the model parallelism optimization saves 34% of training time on average, achieving the goal of accelerating TensorFlow model training.
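The abstract does not reproduce the thesis's pipeline implementation, but the core idea of replacing linear execution (load a batch, then train on it, then load the next) with pipelined execution can be sketched with TensorFlow's `tf.data` API, whose `prefetch` transformation overlaps input preparation with computation. The synthetic dataset, preprocessing step, and model below are illustrative assumptions, not the thesis's code.

```python
import tensorflow as tf

# Illustrative sketch: pipelined input vs. linear input.
# In the linear mode, each training step waits for its batch to be
# prepared; in the pipelined mode, tf.data prepares the next batch
# on the CPU while the device trains on the current one, which is
# what keeps the computing device busy.

def make_dataset(batch_size=128):
    # Synthetic data stands in for a real input source (assumption).
    xs = tf.random.normal([10000, 784])
    ys = tf.random.uniform([10000], maxval=10, dtype=tf.int32)
    ds = tf.data.Dataset.from_tensor_slices((xs, ys))
    ds = ds.shuffle(10000).batch(batch_size)
    # Placeholder preprocessing; num_parallel_calls and prefetch turn
    # the linear pipeline into an overlapped, pipelined one.
    ds = ds.map(lambda x, y: (x / 255.0, y),
                num_parallel_calls=tf.data.AUTOTUNE)
    return ds.prefetch(tf.data.AUTOTUNE)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10),
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(
                  from_logits=True))
model.fit(make_dataset(), epochs=2)
```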
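Likewise, the greedy partitioning algorithm for model parallelism is not listed in the abstract. The following standalone sketch shows one common greedy heuristic that such a partitioner could use: assign each layer, weighted by an estimated compute cost, to the currently least-loaded device. The per-layer cost model and device names are hypothetical, and a real partitioner would also account for communication between adjacent layers, which this sketch omits.

```python
from typing import Dict, List

def greedy_partition(layer_costs: Dict[str, float],
                     devices: List[str]) -> Dict[str, str]:
    """Assign each layer to the least-loaded device (greedy heuristic).

    layer_costs maps a layer name to an estimated compute cost; this
    stands in for whatever cost model the thesis uses (assumption).
    """
    load = {d: 0.0 for d in devices}
    placement = {}
    # Place expensive layers first so large costs are balanced early.
    for layer, cost in sorted(layer_costs.items(),
                              key=lambda kv: kv[1], reverse=True):
        device = min(load, key=load.get)  # least-loaded device so far
        placement[layer] = device
        load[device] += cost
    return placement

# Hypothetical per-layer costs, e.g. estimated FLOPs per step.
costs = {"conv1": 4.0, "conv2": 6.0, "fc1": 3.0, "fc2": 1.0}
print(greedy_partition(costs, ["/gpu:0", "/gpu:1"]))
```

Unlike random partitioning, this balances the estimated load across devices deterministically, which is the property that lets a greedy scheme shorten the critical path of a model-parallel training step.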
Keywords/Search Tags: Deep learning, TensorFlow, distributed parallelism, data parallelism, model parallelism