Parallel Optimization Algorithm For Deep Convolutional Neural Network Based On MapReduce

Posted on: 2023-01-20

Degree: Master

Type: Thesis

Country: China

Candidate: Y Li

Full Text: PDF

GTID: 2558307124971489

Subject: Computer technology

Abstract/Summary:

A deep convolutional neural network (DCNN) is a feedforward neural network that combines convolutional computation with a deep hierarchy. It has excellent feature extraction and generalization capabilities and is widely used in image classification, object tracking and detection, natural language processing, and other fields. However, with the advent of the big data era, the data generated in various fields is growing explosively. As data volume grows, the training cost of the traditional DCNN algorithm increases exponentially, and the training complexity of DCNN models also rises with the complexity of tasks in big data environments. Upgrading the hardware of a single machine alone can no longer meet the demand for DCNN model training on large-scale datasets, so parallel training has received increasing attention. One of the main current research directions is to improve the training method of traditional DCNN models and combine it with a distributed computing model. Existing parallel DCNN algorithms based on MapReduce have reduced the training cost of the traditional DCNN algorithm to a certain extent, but the following problems remain: 1) excessive computation of redundant features increases the burden of model training; 2) insufficient convolution performance affects the overall efficiency of model training; 3) low efficiency of parallel combining reduces the distributed performance of model training; 4) excessive computation of noise features reduces the training efficiency of the model.

To address these shortcomings, a Winograd-based parallel deep convolutional neural network optimization algorithm (WP-DCNN) is proposed. First, a feature filtering strategy based on cosine similarity and normalized mutual information (FF-CSNMI) is designed, which solves the problem of excessive computation of redundant features. Next, a MapReduce-based parallel Winograd convolution strategy (MR-PWC) is presented, which solves the problem of insufficient convolution performance. Finally, a load balancing strategy based on task migration (LB-TM) is presented, which solves the problem of low efficiency in the parallel combining of parameters. Experiments show that the proposed algorithm significantly reduces the training cost of DCNN in a big data environment and also improves the training efficiency of parallel DCNN.

To address the same shortcomings, a parallel deep convolutional neural network optimization algorithm based on the Fast Fourier Transform (PDCNN-FFT) is also proposed. First, an image denoising strategy based on Difference Hash and Non-Local Means (ID-DHNLM) is designed, which solves the problem of excessive computation of redundant features. Then, combined with MapReduce, a parallel convolution strategy based on the Fast Fourier Transform (PC-FFT) is presented, which solves the problem of insufficient convolution performance. Finally, a load balancing strategy based on the LRU-K algorithm (LB-LRUK) is proposed, which solves the problem of low efficiency in the parallel combining of parameters. Experiments show that the proposed algorithm reduces the training cost of DCNN in a big data environment and also achieves relatively good classification performance, making it suitable for training parallelized DCNN models on large-scale datasets.
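
The abstract names Winograd convolution but does not describe how it works. As an illustrative sketch only, and not the thesis's MR-PWC strategy, the standard one-dimensional Winograd F(2,3) transform below computes two outputs of a 3-tap filter from a 4-sample tile using 4 multiplications instead of 6. The function names, the tiling scheme, and the use of Python's built-in `map` and `functools.reduce` as stand-ins for MapReduce phases are all assumptions made for this example.

```python
from functools import reduce


def transform_filter(g):
    """Filter transform for F(2,3); computed once and reused for every tile."""
    return [g[0], (g[0] + g[1] + g[2]) / 2, (g[0] - g[1] + g[2]) / 2, g[2]]


def winograd_f23(d, u):
    """Two outputs from a 4-sample tile d and a transformed filter u.

    Uses 4 multiplications; the direct computation needs 6.
    """
    m1 = (d[0] - d[2]) * u[0]
    m2 = (d[1] + d[2]) * u[1]
    m3 = (d[2] - d[1]) * u[2]
    m4 = (d[1] - d[3]) * u[3]
    return [m1 + m2 + m3, m2 - m3 - m4]


def conv1d_winograd(signal, g):
    """Valid-mode sliding dot product of signal with a 3-tap filter g.

    Hypothetical helper: the signal length must be even and at least 4
    so that it splits cleanly into overlapping 4-sample tiles (stride 2).
    """
    if len(signal) < 4 or len(signal) % 2 != 0:
        raise ValueError("signal length must be even and >= 4")
    u = transform_filter(g)
    tiles = [signal[i:i + 4] for i in range(0, len(signal) - 2, 2)]
    # "Map" phase: each tile is processed independently of the others.
    partial = map(lambda tile: winograd_f23(tile, u), tiles)
    # "Reduce" phase: concatenate the per-tile outputs in tile order.
    return reduce(lambda acc, y: acc + y, partial, [])
```

Because each tile depends only on its own four samples and the shared transformed filter, the tiles can in principle be distributed across workers, which is the property a MapReduce-style parallelization would rely on.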

Keywords/Search Tags:

big data, parallel DCNN algorithm, MapReduce framework, load balancing strategy, parallel Winograd convolution, parallel FFT convolution

Related items

1	Parallel Frequent Itemset Mining Based On MapReduce
2	The Design And Implementation Of Parallel Computing Platform Based On MapReduce
3	The Research On Optimization Of Convolution Neural Network Parallel Algorithm Based On Distributed Environment
4	Research On Parallel FFT-based Convolution Algorithm For ARMv8 Many-core Processors
5	Research On Parallelization And Load Balancing Of Frequent Pattern Mining Algorithm Based On MapReduce
6	Parallel Database System Load Balancing Technology
7	Parallel Query Processing System On Large-scale RDF Data
8	Research On Parallel Convolutional Neural Network Algorithm Based On Big Data
9	Study On Load Balancing Integrated Parallel Programs Development Framework
10	Design And Implementation Of Convolution Neural Network Acceleration Based On FPGA