Research On Parallelization Of Convolutional Neural Networks

Posted on:2014-02-24

Degree:Master

Type:Thesis

Country:China

Candidate:B L Fan

Full Text:PDF

GTID:2248330398478458

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

Convolutional neural networks(CNNs) can learn high order and invariance features from raw inputs, and has been widely used in license plate detection,face detection, gesture recognition, speech recognition, image restoration and semantic analysis etc. However, training the networks is a computation-intensive and data-intensive problem, which need long time to complete. Traditional serial implementation method did not take advantage of the inherent parallelism of the networks, so it can not work efficiently. And, Traditional serial implementation method did not have strong scalability, so it can not deal with the mass of the information age.Google’s parallel programming framework named MapReduce has good scalability and fault tolerance, which is particularly suitable for parallel processing distributed huge amounts of data and has been the current mainstream technology of cloud computing platform for handling large data. This paper paralyze CNNs with MapReduc to enhance the parallelism and scalability of the algorithm, and use GPU to accelerate the computing process, achieved the following results:First, present a method to parallelize the training process of CNNs, which is named CNN-MR algorithm, and deploy CNN-MR algorithm to Hadoop Cloud Computing Platform which was built in our lab. The parallel networks is implemented in master/slave mode. The training data is split to many small splits, and these small splits are distributed to each slave node. Every slave node trains the same networks, then communicates with each other and gathers the intermediate results generated by each slave node, updates the networks in the batch mode at the last. After multiple iterations, the network converge to the threshold or reached the maximum number of iterations, the algorithm terminates.Second, present a method to parallelize CNN-MR algorithm, which is named CNN-MR-G algorithm, and deploy CNN-MR-G algorithm to Hadoop-G Cloud Computing Platform. Feature maps, neurons and weights of each layer map to block, threads, so neurons of the same layer can compute outputs, errors of outputs and the local gradient change ofweights.On the MNIST database of handwritten digits and license plate dataset which is created by our own, CNN-MR algorithm has good speedup and scalability, and CNN-MR-G algorithm accelerates the CNN-MR algorithm greatly.

Keywords/Search Tags:

Convolutional Neural Networks, Parallelism, MapReduce, CUDA

PDF Full Text Request

Related items

1	Research Of Object Detection Algorithm Based On CUDA Acceleration
2	The Research On Image Recognition Algorithm Based On Convolutional Neural Networks
3	Improve Parallelism Of Task Execution To Optimize Utilization Of MapReduce Cluster Resources
4	A Speed Optimization Of Cuda-convnet Deep Convolutional Neural Network Algorithm
5	The Research And Implementation Of Convolutional Neural Network Based On FPGA
6	Action Recognition Method Based On Sparse Auto-Combination Spatio-Temporal Convolutional Neural Network And Its MapReduce Implementation
7	Research On Hardware Parallel Acceleration For Novel Convolutional Neural Networks
8	A High-performance Sparse Convolutional Neural Network Based On GPU
9	Research On Simulation Of Fabrics Deformation With CUDA
10	An Efficient Lookup Table Of FFT Parallelism Using CUDA On GPUs