
Neural Network BP Algorithm Research And Implementation Based On OpenCL

Posted on: 2012-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Zhang
GTID: 2218330368982412
Subject: Computer software and theory
Abstract/Summary:
BP (back-propagation) neural networks are widely used in function approximation, pattern recognition, classification, data compression, data mining, speech recognition, text and language translation, and image processing. However, the BP algorithm has shortcomings such as slow convergence and a tendency to fall into local minima, so many researchers have modified it to accelerate network convergence. Even so, a large-scale neural network may need more than 10,000 learning iterations to reach its final steady state, so training with the BP algorithm still takes a long time.

Owing to its lower price, lower power consumption, higher floating-point computational capacity, and wider bandwidth, the graphics processing unit (GPU) has developed rapidly and has achieved much success in dense computation, physical simulation, and parallel computing. The traditional serial BP algorithm, designed for the CPU architecture, cannot run directly on a GPU because of the differences in architecture. Traditional GPU programming is also tied to a specific platform, vendor, or hardware, so it is not general purpose.

Because the BP algorithm is inefficient for large-scale neural networks, this thesis proposes an improved BP algorithm implemented with OpenCL that exploits the data parallelism of neural networks. With the GPU acting as a coprocessor of the CPU, the algorithm follows a master/slave pattern. The master (CPU) organizes the sample data into a shared buffer or GPU memory. Forward computation and backward learning are implemented as GPU kernels, and each GPU processing unit runs an instance of the same kernel on the sample data. When the kernels finish, the master (CPU) reads back the results for output. In this way the BP algorithm is parallelized across GPU processing units with strong floating-point computational capacity.
The experimental results show that the improved BP algorithm runs much faster than the serial algorithm with no change in precision, which addresses the low efficiency of the BP algorithm on large-scale networks. For different application fields, the sample data can be organized to suit the BP algorithm, so the improved algorithm is general purpose and of practical value.
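
The master/slave pattern described in the abstract can be illustrated with a minimal sketch. This is not the thesis's OpenCL implementation: it is plain Python in which a thread pool stands in for the GPU processing units, and every name in it (`bp_kernel`, `train_epoch`, `compare_serial_and_parallel`, the XOR data set, the network size, and the learning rate) is an assumption chosen for illustration. The master builds one work item per sample, every worker runs the same kernel (forward pass plus backward pass) on its sample, and the master then gathers the per-sample gradients and updates the weights.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bp_kernel(work_item):
    """One work item: forward and backward pass for a single sample.

    Reads the shared weights but never modifies them; returns the
    per-sample gradients and squared error to the master.
    """
    w_hidden, w_out, x, target = work_item
    x_b = x + [1.0]                      # input plus bias term
    h = [sigmoid(sum(w * v for w, v in zip(row, x_b))) for row in w_hidden]
    h_b = h + [1.0]                      # hidden activations plus bias term
    y = sigmoid(sum(w * v for w, v in zip(w_out, h_b)))

    d_out = (y - target) * y * (1.0 - y)           # output-layer delta
    g_out = [d_out * v for v in h_b]
    g_hidden = []
    for j, hj in enumerate(h):
        d_h = d_out * w_out[j] * hj * (1.0 - hj)   # hidden-layer delta
        g_hidden.append([d_h * v for v in x_b])
    return g_hidden, g_out, 0.5 * (y - target) ** 2


def train_epoch(w_hidden, w_out, samples, lr, mapper):
    """Master step: dispatch the same kernel over all samples, then reduce."""
    work = [(w_hidden, w_out, x, t) for x, t in samples]
    results = list(mapper(bp_kernel, work))   # all kernels finish first
    n = len(samples)
    for g_hidden, g_out, _ in results:        # weights change only here
        for j, row in enumerate(g_hidden):
            for i, g in enumerate(row):
                w_hidden[j][i] -= lr * g / n
        for j, g in enumerate(g_out):
            w_out[j] -= lr * g / n
    return sum(r[2] for r in results) / n


# Hypothetical toy data set (XOR), used only to exercise the sketch.
XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
       ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]


def run(mapper, epochs=50, seed=0, n_hidden=3):
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    loss = None
    for _ in range(epochs):
        loss = train_epoch(w_hidden, w_out, XOR, 0.5, mapper)
    return w_hidden, w_out, loss


def compare_serial_and_parallel():
    serial = run(map)                          # kernels run one after another
    with ThreadPoolExecutor(max_workers=4) as pool:
        parallel = run(pool.map)               # kernels run on pool workers
    return serial, parallel
```

Because the master collects the results in sample order before updating any weight, the serial and pooled runs in this sketch perform the same arithmetic in the same order. That mirrors, on a toy scale, the abstract's claim that parallelizing over samples need not change the precision of the result. The sketch says nothing about speed; Python threads do not model GPU throughput.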
Keywords/Search Tags:Neural network, BP algorithm, data parallel, OpenCL technology, stream processor GPU