Neural Network BP Algorithm Research And Implementation Based On OpenCL

Posted on:2012-10-20

Degree:Master

Type:Thesis

Country:China

Candidate:Q H Zhang

Full Text:PDF

GTID:2218330368982412

Subject:Computer software and theory

Abstract/Summary:

The BP (back-propagation) neural network is widely used in function approximation, pattern recognition, classification, data compression, data mining, speech recognition, text and language translation, and image processing. However, the algorithm has shortcomings such as slow convergence and a tendency to fall into local minima, so many researchers have improved it to speed up network convergence. Even so, a large-scale neural network may need more than ten thousand training iterations to reach its final steady state, so the BP algorithm still takes a long time to train.
Thanks to its lower price, lower power consumption, higher floating-point throughput and wider bandwidth, the graphics processing unit (GPU) has developed rapidly and has achieved considerable success in compute-intensive workloads, physical simulation and parallel computing. The traditional serial BP algorithm, designed for the CPU architecture, cannot run directly on a GPU because the two architectures differ. Traditional GPU programming is also tied to a specific platform, vendor or hardware, so it is not general-purpose.
Because the BP algorithm is inefficient on large-scale neural networks, this thesis proposes an improved BP algorithm built on OpenCL that exploits the data parallelism of neural networks. With the GPU acting as a coprocessor of the CPU, the algorithm follows a master/slave pattern. The master (CPU) organizes the sample data into a shared buffer or GPU memory. The forward computation and backward learning are implemented as GPU kernels, and each GPU processing unit runs the same kernel instance to process the sample data. When the kernels finish, the master (CPU) reads back the results and outputs them. In this way the BP algorithm runs in parallel on the GPU, taking advantage of its strong floating-point capability.
The experimental results show that, with precision unchanged, the improved BP algorithm runs much faster than the serial algorithm, which solves the low-efficiency problem of the BP algorithm on large-scale networks. For different application fields, the sample data can be organized appropriately to suit the BP algorithm, so the improved algorithm is general-purpose and of practical value.
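
The master/slave organization described in the abstract can be illustrated with a minimal sketch. It is not the thesis's OpenCL implementation: it is plain Python, and the names `bp_kernel`, `train_epoch` and `mean_loss`, the one-hidden-layer network, the sigmoid activation, the squared-error loss and the batch-averaged weight update are all assumptions made for illustration. The point it shows is that each sample's forward and backward pass only reads the shared weights, so one kernel instance per sample can run independently before the master combines the results.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bp_kernel(sample, w_hidden, w_out):
    """Per-sample work item (hypothetical): forward pass, then backward pass.

    Reads the shared weights but never writes them, so every sample can be
    processed independently. Returns (output, grad_w_hidden, grad_w_out).
    """
    x, target = sample
    # Forward: hidden activations, then a single sigmoid output unit.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))
    # Backward: gradient of 0.5 * (y - target)^2 with respect to each weight.
    delta_out = (y - target) * y * (1.0 - y)
    grad_out = [delta_out * hj for hj in h]
    grad_hidden = [
        [delta_out * w_out[j] * h[j] * (1.0 - h[j]) * xi for xi in x]
        for j in range(len(w_hidden))
    ]
    return y, grad_hidden, grad_out


def train_epoch(samples, w_hidden, w_out, lr):
    """Master step (assumed scheme): run one kernel per sample, reduce, update."""
    # On a GPU these kernel instances would run concurrently; here a plain
    # list comprehension stands in for the parallel launch.
    results = [bp_kernel(s, w_hidden, w_out) for s in samples]
    n = len(samples)
    # Reduce: apply the averaged gradient of all samples to the weights.
    for _, g_hidden, g_out in results:
        for j, row in enumerate(g_hidden):
            for i, g in enumerate(row):
                w_hidden[j][i] -= lr * g / n
        for j, g in enumerate(g_out):
            w_out[j] -= lr * g / n
    # Outputs are the ones computed before this epoch's update.
    return [y for y, _, _ in results]


def mean_loss(outputs, samples):
    return sum(0.5 * (y - t) ** 2 for y, (_, t) in zip(outputs, samples)) / len(samples)


if __name__ == "__main__":
    random.seed(0)
    # XOR samples; the third input is a constant 1.0 acting as a bias term.
    data = [([0.0, 0.0, 1.0], 0.0), ([0.0, 1.0, 1.0], 1.0),
            ([1.0, 0.0, 1.0], 1.0), ([1.0, 1.0, 1.0], 0.0)]
    w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
    w_out = [random.uniform(-1, 1) for _ in range(3)]
    first = mean_loss(train_epoch(data, w_hidden, w_out, 0.5), data)
    for _ in range(2000):
        outputs = train_epoch(data, w_hidden, w_out, 0.5)
    print("loss before:", first, "loss after:", mean_loss(outputs, data))
```

In an OpenCL version, `bp_kernel` would correspond to the device kernel, the `samples` list to a buffer written by the host, and the reduction loop to the step where the host (or a second kernel) collects the per-sample results. How the thesis actually partitions work and combines updates is not stated in the abstract.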

Keywords/Search Tags:

Neural network, BP algorithm, data parallel, OpenCL technology, stream processor GPU

Related items

1	Research On Parallel Optimization Technology For Accelerating Data-intensive Algorithms
2	Design And Implementation Of Data-Parallel Memory System For Tiled Stream Processor
3	Parallel Accelerated Implementation Of Image Dehazing Algorithm Based On OpenCL
4	Parallel Analysis And Acceleration Method Of AES Algorithm Based On OpenCL
5	Study And Implementation Of High Performance Parallel Hierarchy Stream Memory System
6	Design And Implementation Of Parallel Decoding Techniques For HEVC Multiplex Video Stream Based On Multicore Processor
7	The Multi-thread Parallel AES Algorithm Based On Opencl
8	Parallel Optimization Technology Of Satellite Image Decompression Based On Multi-core Processor
9	Research On Parallel Accelerating Algorithm Based On OpenCL And Realization On FPGA
10	Stream Data Target Recognition Algorithm And Application Research Based On Domestic Processor