
Optimizing the Scheduling of Data Parallelism on the Deep Learning Framework TensorFlow

Posted on: 2020-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Huang
Full Text: PDF
GTID: 2428330596476775
Subject: Engineering
Abstract/Summary:
With the rapid development of science and technology, artificial intelligence is being applied ever more widely in practical engineering, spanning medicine and diagnosis, robot control, finance, law, scientific discovery, toys, and many other fields that touch human life, which shows its importance. Whatever the field, it involves domain data sets and model training. To improve the efficiency of model training on massive data, distributed training has emerged, in both model-parallel and data-parallel forms, but several problems remain. First, in data parallelism with synchronous updates, the transmission of parameters is pure communication overhead rather than something that speeds up each iteration; because of the heavy communication, each iteration may take longer than on a single machine, the straggler ("short board") effect is pronounced, and resource utilization is low. Second, gradient staleness occurs with asynchronous updates: while some workers are still computing with an old gradient version, the parameter version on the parameter server has already been updated several times, and the stale gradients make the gradient-descent process unstable. No reliable algorithm has previously been proposed and implemented for these problems.

First, to address the problems above, this thesis proposes a ring-update model structure combined with gradient selection. In traditional solutions, gradient selection is used to reduce the volume of parameters that must be communicated, while ring updates are used to eliminate the parameter-server bottleneck in the system; this thesis combines the two. The gradient-selection algorithm first reduces the parameters to be exchanged, and the ring algorithm then accelerates the exchange and averaging of those parameters, so the model's training time is greatly reduced.

Second, for asynchronous updates in the data-parallel model, this thesis proposes an improved algorithm based on a staleness threshold, which effectively reduces instability during gradient descent. When a worker requests a parameter update, its parameter version is first compared with the current version on the server; if the difference exceeds the threshold, the update is discarded, otherwise it is applied as a new version.

The proposed methods are validated on the training time and accuracy of a face-recognition model. Compared with the original method, the experimental results show that the algorithms in this thesis effectively address the problems of data parallelism: they reduce the training time of the data-parallel model by 19.7%, improve GPU utilization by 22.9%, and enhance the stability of the system.
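The abstract does not give implementation details, but the first contribution it describes, gradient selection followed by a ring exchange, can be sketched roughly as below. This is a minimal single-process simulation, assuming top-k magnitude selection for the gradient-selection step and a naive, unchunked ring pass for the ring update; the function names, the keep_ratio parameter, and the simulated message passing are illustrative assumptions, not the thesis's actual code.

```python
import numpy as np

def select_gradient(grad, keep_ratio=0.01):
    """Gradient selection: keep only the largest-magnitude entries and zero
    out the rest, so far fewer values need to be communicated."""
    k = max(1, int(grad.size * keep_ratio))
    kth_largest = np.partition(np.abs(grad).ravel(), -k)[-k]
    return np.where(np.abs(grad) >= kth_largest, grad, 0.0)

def ring_average(grads):
    """Naive ring exchange: each node repeatedly forwards what it received
    to its right neighbour; after n - 1 steps every node has accumulated
    every other node's (already sparsified) gradient, with no central server."""
    n = len(grads)
    acc = [g.copy() for g in grads]    # running sum held by each node
    send = [g.copy() for g in grads]   # buffer each node forwards next step
    for _ in range(n - 1):
        recv = [send[(i - 1) % n] for i in range(n)]  # simulated ring messages
        for i in range(n):
            acc[i] = acc[i] + recv[i]
        send = recv
    return [a / n for a in acc]

# Example: four workers, each with a local gradient.
workers = [np.random.randn(10) for _ in range(4)]
sparse = [select_gradient(g, keep_ratio=0.3) for g in workers]
averaged = ring_average(sparse)  # every node ends with the same averaged gradient
```

A production ring all-reduce splits each gradient into n chunks and pipelines a reduce-scatter and an all-gather phase so per-node bandwidth stays roughly constant as workers are added; the naive pass above only illustrates that the averaging needs no central parameter server.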
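The staleness-threshold rule for asynchronous updates can likewise be sketched. The class name, the default threshold value, and the plain SGD update step are assumptions made for illustration; the thesis's actual parameter-server implementation is not described in the abstract.

```python
import numpy as np

class ThresholdParameterServer:
    """Sketch of a parameter server that discards updates whose gradient
    version lags the current parameter version by more than a threshold."""

    def __init__(self, init_params, staleness_threshold=4, lr=0.01):
        self.params = np.array(init_params, dtype=float)
        self.version = 0                       # version of the global parameters
        self.threshold = staleness_threshold   # maximum tolerated version gap
        self.lr = lr

    def pull(self):
        # A worker fetches the current parameters together with their version.
        return self.params.copy(), self.version

    def push(self, gradient, worker_version):
        # Compare the worker's parameter version with the server's current one.
        if self.version - worker_version > self.threshold:
            return False                       # too stale: abandon this update
        self.params -= self.lr * np.asarray(gradient)
        self.version += 1                      # accepted: becomes a new version
        return True
```

A worker would call pull(), compute a gradient on its mini-batch, then call push(grad, version); a rejected push simply pulls fresh parameters and retries, which is what keeps stale gradients from destabilizing the descent.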
Keywords/Search Tags: deep learning, TensorFlow, data parallelism, model training, circular update