
Parallel Computing Of Deep Learning Based On Hadoop

Posted on: 2018-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: R Q Zhang
GTID: 2428330596989263
Subject: Electronic and communication engineering

Abstract
With the continuous development of machine learning technology, deep learning models, represented by deep neural networks, have been widely applied in fields such as image recognition, speech recognition, and natural language processing, often providing the best available solutions. Unlike shallow networks, deep models can, as the number of layers and neurons increases, describe more complex function transformations, and therefore have stronger learning and expressive ability. In particular, they can learn directly from observed data, without requiring an explicit solution to the problem to be known in advance. However, deep neural networks are harder to train than shallow ones; parallel computing can address the computing-resource and efficiency problems they face.

In this thesis, several existing parallel computing models for neural networks in distributed environments were analyzed. In view of the defects in the implementation strategies of these models, a new parallel learning algorithm based on convolutional neural networks was proposed. Convolutional neural networks use the ideas of "local receptive fields", "shared weights", and "pooling" to optimize and improve the traditional fully connected neural network, and they perform especially well on data with a two-dimensional topology, as in image classification and image recognition. Traditional convolutional neural networks are generally trained serially with the standard error back-propagation algorithm; with the rapid growth of data scale, such serial training is time-consuming and occupies many system resources.

Taking handwritten digit image recognition as the application scenario, this thesis studies and experimentally analyzes parallel training of convolutional neural networks. To realize parallel training of convolutional neural networks on massive data, a parallel training model of error back-propagation based
on the MapReduce framework was proposed. The model combines the standard error back-propagation algorithm with the MapReduce programming model: using data parallelism, it divides a large data set into several subsets and processes them in parallel, with only a small loss of accuracy. At the same time, Gaussian elastic deformation was used to expand the MNIST data set into a new, larger data set for the image recognition experiments. Experiments showed that the algorithm adapts well to the scale of the data: with the Hadoop distributed computing platform and an appropriately chosen number of data splits, it can greatly improve the training efficiency of convolutional neural networks.
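The data-parallel scheme described above, in which each mapper computes gradients on its own data subset and a reduce step averages them before applying one weight update, can be sketched without Hadoop itself. Below is a minimal Python sketch: a tiny logistic-regression model stands in for the convolutional network, and the shard count, learning rate, and data are illustrative assumptions, not the thesis's actual settings.

```python
import numpy as np

def map_gradient(w, X, y):
    """Map step: one data shard computes its local mean gradient."""
    p = 1.0 / (1.0 + np.exp(-X @ w))       # sigmoid predictions
    return X.T @ (p - y) / len(y)           # log-loss gradient on this shard

def reduce_update(w, grads, lr=0.5):
    """Reduce step: average the shard gradients, apply one update."""
    return w - lr * np.mean(grads, axis=0)

# Synthetic, nearly linearly separable data (stands in for MNIST features).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = (X @ true_w + rng.normal(scale=0.1, size=400) > 0).astype(float)

# "Divide the large data set into several subsets" -> 4 shards here.
shards = np.array_split(np.arange(400), 4)

w = np.zeros(3)
for _ in range(200):
    grads = [map_gradient(w, X[idx], y[idx]) for idx in shards]
    w = reduce_update(w, grads)
```

Because the shard gradients are averaged, one synchronous round is mathematically equivalent to full-batch gradient descent on the union of the shards; accuracy loss only enters when rounds are made less frequent or asynchronous to cut communication cost.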
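The Gaussian elastic deformation used above to expand MNIST is, by this description, the familiar elastic-distortion idea: a random per-pixel displacement field is smoothed with a Gaussian filter and scaled before resampling the image. A hedged sketch follows; the `alpha` and `sigma` values are illustrative assumptions, not the parameters used in the thesis.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(image, alpha=8.0, sigma=3.0, rng=None):
    """Warp an image with a Gaussian-smoothed random displacement field.

    alpha scales the displacement magnitude; sigma controls smoothness.
    """
    if rng is None:
        rng = np.random.default_rng()
    # Random field in [-1, 1] per pixel, smoothed and scaled.
    dx = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(image.shape[0]),
                         np.arange(image.shape[1]), indexing="ij")
    coords = np.array([ys + dy, xs + dx])
    # Bilinear resampling at the displaced coordinates.
    return map_coordinates(image, coords, order=1, mode="reflect")

# Stand-in for a 28x28 MNIST digit: a vertical bar resembling a "1".
img = np.zeros((28, 28))
img[8:20, 12:16] = 1.0
warped = elastic_deform(img, rng=np.random.default_rng(42))
```

Applying the function repeatedly with fresh random fields turns each original image into many plausible variants, which is how a data set like MNIST can be expanded into a much larger one for parallel-training experiments.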
Keywords: Deep learning, Convolutional neural networks, Error back-propagation algorithm, Hadoop data parallelization