
Research On Compiling Optimization Method For Deep Learning Extension Algorithm

Posted on: 2018-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: W Yang
Full Text: PDF
GTID: 2348330515478435
Subject: Computer software and theory

Abstract/Summary:
In recent years, with the development of artificial intelligence and the wide application of big data, deep learning technology has advanced profoundly. Owing to its strong predictive accuracy, deep learning has been applied to natural language processing, image recognition, intelligent robotics, recommendation engines, speech translation, autonomous driving, and many other fields. Deep learning introduces many different network structures to train models, each consisting of a set of interconnected network layers.

Despite its remarkable achievements, deep learning is not a solution to everything, and its methods still face many open problems, such as how to extract better features, how to improve the predictive performance of a network model, and how to reduce memory usage and time consumption. To apply deep learning to a wider range of platforms and fields, the scale and performance of the model become crucial issues. Much current research therefore aims to improve deep learning efficiency by optimizing the network model, library functions, and hardware support, and by using efficient programming frameworks and methods to speed up the learning model.

In this paper, we study the following aspects of optimizing deep learning methods:

1. Based on an existing Python implementation of the Faster RCNN model for object detection and classification, we reconstruct the program in C++. Using a high-performance programming language significantly shortens the model's detection time.

2. GPU programming is used to parallelize the parts of the reconstructed program that can run concurrently. Given the massive data calculations involved, parallelization has tremendous efficiency advantages over serial execution. In addition, reasonable scheduling of the execution order on the GPU and CPU reduces the overhead of data transfer between them.

3. In special application scenarios, the input data of the deep learning model can be preprocessed. Taking the application of Faster RCNN to object detection in intelligent video surveillance as an example, we use a Gaussian mixture model (GMM) to erase the background of each segmented image and retain only the foreground, which significantly reduces the scale of computation.

4. Because GPU programming requires a certain expertise and programmers must manually mark parallel regions from experience, we extend the domain-specific programming language Para C to support common deep learning algorithms and expand its basic operations, using Faster RCNN as an example to enable Para C to parallelize code automatically.

In this paper, the whole deep learning workflow is introduced through the application of Faster RCNN. We optimize it in several aspects and run experiments to compare performance. The experimental results show that the whole Faster RCNN pipeline achieves good optimization results.
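The scheduling idea in point 2 can be illustrated without GPU hardware. The thesis overlaps data transfer with computation on the GPU and CPU; the sketch below is a pure-Python analogue of that pattern, assuming a simple two-stage pipeline in which the next batch is staged while the current batch is processed. The functions `transfer` and `compute` are placeholders, not the thesis's actual kernels.

```python
# Pipelined scheduling sketch: overlap a "transfer" stage (standing in
# for a host-to-device copy) with a "compute" stage (standing in for a
# GPU kernel), so transfers for batch i+1 run while batch i is computed.
from concurrent.futures import ThreadPoolExecutor

def transfer(batch):
    # placeholder for copying input data to the accelerator
    return [x * 1.0 for x in batch]

def compute(staged):
    # placeholder for the actual kernel launched on the staged data
    return sum(staged)

def pipelined_run(batches):
    results = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        staged = pool.submit(transfer, batches[0])
        for nxt in batches[1:]:
            ready = staged.result()
            staged = pool.submit(transfer, nxt)   # prefetch next batch
            results.append(compute(ready))        # overlaps with prefetch
        results.append(compute(staged.result()))  # drain the last batch
    return results

print(pipelined_run([[1, 2], [3, 4], [5, 6]]))  # [3.0, 7.0, 11.0]
```

In CUDA terms, the two executor workers play the role of separate streams: issuing the copy for the next batch before synchronizing on the current one is what hides the transfer latency.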
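The GMM preprocessing in point 3 can also be sketched at the level of a single pixel. Background-subtraction GMMs maintain a few Gaussian modes per pixel and treat samples that match a dominant, well-established mode as background. The minimal pure-Python model below is an illustrative sketch, not the thesis's implementation; the parameters `k`, `alpha`, and `match_thresh` are assumed values chosen for the demo.

```python
# Per-pixel Gaussian-mixture background model (illustrative sketch).
# Each mode is stored as [weight, mean, variance]; a sample that falls
# within match_thresh standard deviations of a mode updates that mode,
# otherwise it spawns a new low-weight mode and is treated as foreground.
class PixelGMM:
    def __init__(self, k=3, alpha=0.05, match_thresh=2.5):
        self.k = k                        # max number of modes per pixel
        self.alpha = alpha                # learning rate
        self.match_thresh = match_thresh  # match threshold (in std devs)
        self.modes = []                   # list of [weight, mean, var]

    def update(self, x):
        """Feed one grayscale sample; return True if x is foreground."""
        for m in self.modes:
            w, mu, var = m
            if abs(x - mu) <= self.match_thresh * var ** 0.5:
                # matched an existing mode: adapt it toward x
                m[0] = w + self.alpha * (1.0 - w)
                m[1] = mu + self.alpha * (x - mu)
                m[2] = var + self.alpha * ((x - mu) ** 2 - var)
                self._normalize()
                # foreground if the matched mode carries little weight
                return m[0] < 0.2
        # no match: replace the weakest mode with one centred on x
        if len(self.modes) >= self.k:
            self.modes.sort(key=lambda m: m[0])
            self.modes.pop(0)
        self.modes.append([self.alpha, float(x), 30.0])
        self._normalize()
        return True  # a brand-new mode is treated as foreground

    def _normalize(self):
        s = sum(m[0] for m in self.modes)
        for m in self.modes:
            m[0] /= s

# demo: a stable pixel settles into background; a sudden change is foreground
px = PixelGMM()
for _ in range(50):
    px.update(100)
print(px.update(100))   # False: matches the dominant background mode
print(px.update(200))   # True: no existing mode explains the new value
```

Applied frame-wide, pixels classified as background are erased before the frame reaches Faster RCNN, which is how the preprocessing step shrinks the detection workload.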
Keywords/Search Tags:deep learning, optimization, GPU programming, domain programming language