
Research On Multi-Hierarchy Parallel Algorithm Of Convolutional Neural Network On Cloud Computing

Posted on: 2021-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: Q S Shen
GTID: 2428330632957712
Subject: Computer Science and Technology
Abstract/Summary:
Artificial intelligence has been a research hotspot in recent years. The convolutional neural network (CNN) is one of the most powerful tools for pattern recognition problems. The development of high-speed computers and high-speed networks makes further research on CNNs, and their practical application, possible, and the era of big data supplies enough training data to ensure the accuracy of the algorithm and its predictions. However, the long duration of CNN training and prediction impedes further development of the algorithm. Parallelization is an effective way to improve its efficiency. To accelerate CNN training as much as possible without affecting the accuracy of the algorithm or changing the model framework, this thesis proposes a new multi-hierarchy parallel algorithm for CNNs. The main contributions are as follows. First, convolution is vectorized with SIMD. The SIMD instructions of the CPU are used to vectorize the convolution calculation, which eliminates the synchronization overhead of parallel computing, and data held in registers are reused to reduce memory access time and improve the cache hit rate. Second, the convolutional layers are parallelized on a multi-core CPU. The input data are divided by the block matrix method, with the block size chosen according to the number of CPU cores, the cache size, and other factors. The resulting submatrices are assigned to different cores according to the affinity of the computing cores on the CPU chip. Third, the CNN is parallelized on the Hadoop platform. The raw data are grouped and assigned to different DataNode groups to implement data parallelism, and the data on each DataNode group are processed in a pipeline-parallel manner. These processes are encapsulated with JNI and ported to the Hadoop cloud computing platform. The experimental results indicate that the parallel algorithm achieves a high speedup. The parallel algorithm that combines SIMD vectorization with register reuse and cache-aware data partitioning shows superlinear speedup. After porting to the Hadoop platform, if there is no communication between machines and the data fit in memory, JNI can exploit the full performance of the C program, and the speedup is of the same magnitude as that of the original C program. The platform therefore provides a user-friendly and convenient interface while losing as little computing efficiency as possible.
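The block-matrix partitioning of a convolutional layer described in the abstract can be illustrated with a minimal C sketch. It is not the thesis's implementation: it assumes a single-channel, stride-1 convolution without padding, splits only the output rows into contiguous blocks by worker count (the thesis also considers cache size and core affinity), and runs the blocks sequentially in place of threads pinned to cores. The names `row_block`, `block_for_worker`, `conv2d_rows`, and `conv2d_blocked` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [begin, end) of output rows assigned to one worker.
 * Hypothetical type, introduced only for this illustration. */
typedef struct {
    size_t begin;
    size_t end;
} row_block;

/* Split `rows` output rows into `workers` contiguous blocks whose sizes
 * differ by at most one row. Assumption: balancing is by row count only. */
row_block block_for_worker(size_t rows, size_t workers, size_t w)
{
    assert(workers > 0 && w < workers);
    size_t base = rows / workers;   /* rows every worker receives        */
    size_t extra = rows % workers;  /* first `extra` workers get one more */
    row_block b;
    b.begin = w * base + (w < extra ? w : extra);
    b.end = b.begin + base + (w < extra ? 1 : 0);
    return b;
}

/* Stride-1, unpadded 2D convolution (cross-correlation, as used in CNNs)
 * restricted to the output rows in `blk`. Each block writes a disjoint
 * part of `out`, so blocks need no synchronization with each other. */
void conv2d_rows(const float *in, size_t in_h, size_t in_w,
                 const float *k, size_t k_h, size_t k_w,
                 float *out, row_block blk)
{
    assert(k_h >= 1 && k_w >= 1 && k_w <= in_w);
    assert(blk.end + k_h - 1 <= in_h);
    size_t out_w = in_w - k_w + 1;
    for (size_t r = blk.begin; r < blk.end; ++r) {
        for (size_t c = 0; c < out_w; ++c) {
            float acc = 0.0f;
            for (size_t i = 0; i < k_h; ++i)
                for (size_t j = 0; j < k_w; ++j)
                    acc += in[(r + i) * in_w + (c + j)] * k[i * k_w + j];
            out[r * out_w + c] = acc;
        }
    }
}

/* Process every block in turn. The iterations are independent, so each
 * one could instead be handed to a thread bound to a specific core. */
void conv2d_blocked(const float *in, size_t in_h, size_t in_w,
                    const float *k, size_t k_h, size_t k_w,
                    float *out, size_t workers)
{
    assert(k_h >= 1 && k_h <= in_h);
    size_t out_h = in_h - k_h + 1;
    for (size_t w = 0; w < workers; ++w)
        conv2d_rows(in, in_h, in_w, k, k_h, k_w, out,
                    block_for_worker(out_h, workers, w));
}
```

Because each block owns a disjoint range of output rows, the result is identical to an unpartitioned convolution regardless of the number of workers; only the assignment of rows to workers changes.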
Keywords/Search Tags: convolutional neural network, SIMD instruction, multi-core CPU, Hadoop cloud computing