Research On Multi-Hierarchy Parallel Algorithm Of Convolutional Neural Network On Cloud Computing

Posted on:2021-03-24

Degree:Master

Type:Thesis

Country:China

Candidate:Q S Shen

GTID:2428330632957712

Subject:Computer Science and Technology

Abstract/Summary:

Artificial intelligence has been a research hotspot in recent years, and the convolutional neural network (CNN) is one of the most powerful tools for pattern recognition. The development of high-speed computers and networks makes further research on CNNs and their practical application possible, and the era of big data provides enough training data to ensure the accuracy of the algorithm and its predictions. However, the long duration of CNN training and prediction impedes the further development of the algorithm. Parallelization is an effective way to improve its efficiency. To accelerate CNN training as much as possible without sacrificing accuracy through changes to the model framework, this thesis proposes a new multi-hierarchy parallel algorithm for CNNs. The main contributions are as follows.

First, convolution is vectorized with SIMD. The CPU's SIMD instructions are used to vectorize the convolution calculation, which eliminates the synchronization overhead of parallel computing. Data already held in registers is reused to reduce memory-access time and improve the cache hit rate.

Second, the convolutional layers are parallelized on a multi-core CPU. The input data is divided with the block matrix method, and the block size is chosen according to the number of CPU cores, the cache size, and other factors. The resulting submatrices are assigned to different cores according to the affinity of the chip's computing cores.

Third, the CNN is parallelized on the Hadoop platform. The raw data is grouped and assigned to different DataNode groups to achieve data parallelism, and pipeline parallelism is applied to the data on each DataNode group. These processes are encapsulated with JNI and ported to the Hadoop cloud computing platform.

The experimental results indicate that the parallel algorithm achieves a very high speedup. With SIMD vectorization that reuses registers and data partitioning that takes cache capacity into account, the experiments show superlinear speedup. After porting to the Hadoop platform, if there is no communication between machines and the data fits in memory, JNI delivers the full performance of the C program, and the speedup is of the same magnitude as that of the original C program. The platform thus offers a user-friendly, convenient interface while losing as little computing efficiency as possible.
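
The first two contributions can be illustrated with a minimal sketch in C. It is not the thesis's implementation: the function names `conv2d_rows` and `band_range`, the choice of 128-bit SSE intrinsics, and the simple row-band split are all assumptions made for illustration. The sketch keeps the accumulator in a register across all kernel taps (register reuse) and splits the output rows into bands that separate cores could process. Thread creation, core affinity, and cache-aware block sizing are left out.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: valid (no padding, stride 1) 2D cross-correlation
 * for output rows [y0, y1). in is h x w, k is kh x kw, and out is
 * (h-kh+1) x (w-kw+1); all arrays are row-major floats. */
static void conv2d_rows(const float *in, size_t h, size_t w,
                        const float *k, size_t kh, size_t kw,
                        float *out, size_t y0, size_t y1)
{
    assert(kh > 0 && kw > 0 && h >= kh && w >= kw);
    const size_t oh = h - kh + 1, ow = w - kw + 1;
    assert(y0 <= y1 && y1 <= oh);

    for (size_t y = y0; y < y1; ++y) {
        size_t x = 0;
#if defined(__SSE__)
        /* Four adjacent outputs per iteration. The accumulator stays in
         * one register across every kernel tap and is stored only once. */
        for (; x + 4 <= ow; x += 4) {
            __m128 acc = _mm_setzero_ps();
            for (size_t ky = 0; ky < kh; ++ky) {
                const float *row = in + (y + ky) * w + x;
                for (size_t kx = 0; kx < kw; ++kx) {
                    __m128 wgt = _mm_set1_ps(k[ky * kw + kx]);
                    acc = _mm_add_ps(acc,
                                     _mm_mul_ps(wgt, _mm_loadu_ps(row + kx)));
                }
            }
            _mm_storeu_ps(out + y * ow + x, acc);
        }
#endif
        /* Scalar tail (and full fallback when SSE is unavailable). */
        for (; x < ow; ++x) {
            float acc = 0.0f;
            for (size_t ky = 0; ky < kh; ++ky)
                for (size_t kx = 0; kx < kw; ++kx)
                    acc += k[ky * kw + kx] * in[(y + ky) * w + x + kx];
            out[y * ow + x] = acc;
        }
    }
}

/* Hypothetical helper: split oh output rows into nbands contiguous bands
 * and return the half-open row range [*y0, *y1) of band b. */
static void band_range(size_t oh, size_t nbands, size_t b,
                       size_t *y0, size_t *y1)
{
    assert(nbands > 0 && b < nbands);
    *y0 = oh * b / nbands;
    *y1 = oh * (b + 1) / nbands;
}
```

In a multi-core setting, each band returned by `band_range` would be handed to one worker thread that calls `conv2d_rows` on it. Because the bands write to disjoint output rows, the workers need no synchronization until all bands are finished.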

Keywords/Search Tags:

convolutional neural network, SIMD instruction, multi-core CPU, Hadoop cloud computing

Related items

1	Instruction-flow Scheduling Mechanism For High-performance SIMD DSP
2	Research On Security Enhancement Mechanism For Convolutional Neural Network Predictions In Cloud
3	ILP-SIMD: An instruction parallel SIMD architecture with short-wire interconnects
4	The SIMD Compiler Optimization Methods Research
5	Research On Computation Methods Of Neural Networks And Applications Based On A Cloud Computing Platform
6	Research And Implementation Of Clustering And Neural Network Algorithm Based On Cloud Computing Platform Hadoop
7	The Design And Implementation Of A Mechanism For Branch Handling In SIMD
8	The Orchestration Of Instruction Issuing In Data Parallel Processors
9	Research On A Class Of Disease Analysis Model Based On Cloud Medical Treatment
10	High Resolution And Fast Space-Borne SAR Imaging Research Based On Heterogeneous SIMD Parallelism