FPGA Based Convolutional Neural Network Accelerator Design And Realization

Posted on: 2018-11-24
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Wang
GTID: 2348330512987995
Subject: Communication and Information System
Abstract/Summary:
With advances in computer performance and further research in machine learning, the convolutional neural network (CNN) has become one of the most popular algorithms in artificial intelligence. However, because of its complex structure and training method, a CNN needs a large amount of computational resources. Conventional CNN implementations train on general-purpose CPUs, which is comparatively slow and poorly suited to real-time applications. Field-programmable gate array (FPGA) technology is highly flexible and lends itself to parallel computing, and it has shown strong capability for many artificial intelligence algorithms, including CNNs. This thesis proposes a novel FPGA-based architecture for accelerating CNN computation. The ETL9B Japanese Handwriting Database is used to verify its effectiveness. The model achieves 99.7% accuracy, and processing time is reduced by around 90%.

The first chapter introduces CNNs. It describes the basic concepts, including the history and development of the CNN algorithm. To aid understanding of the proposed design, the chapter also analyzes existing FPGA-based CNN structures and introduces applications of CNNs. The second chapter presents the mathematical background of CNNs, including image convolution, pooling, activation functions, and the back-propagation algorithm for training neural networks. It also reviews developments in CNN structure. The third chapter describes the principle of computing the activation function with the Coordinate Rotation Digital Computer (CORDIC) algorithm. To accelerate the calculation, this thesis modifies the conventional CORDIC algorithm. A new rotation strategy called the Unified Rotation Strategy (URS) is proposed, which combines the look-up table method with a greedy algorithm. URS-CORDIC completes the iteration procedure of conventional CORDIC more quickly. Chapter four proposes the hardware architecture for the CNN and describes the detailed design, including the CORDIC processor, the convolution computing block, the max-pooling unit, and the control block. The usage of on-chip memory is also analyzed to avoid redundant memory accesses. Finally, chapter five provides the simulation and implementation results of the proposed CNN hardware accelerator, including the CORDIC processor and the convolution computing block. The system is tested on the ETL9B Japanese Handwriting Database. Compared with the software implementation, the training time is drastically reduced, which shows the advantage of FPGAs for the CNN training procedure.
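As a rough illustration of the conventional CORDIC iteration that the third chapter starts from, the sketch below evaluates tanh with hyperbolic CORDIC in rotation mode. It is not the URS-CORDIC proposed in the thesis, whose details are not given in this abstract. The function names, the 16-iteration default, and the use of floating point in place of fixed-point shifts are all assumptions made for illustration.

```python
import math

def hyperbolic_cordic(theta, n_iter=16):
    """Approximate (cosh(theta), sinh(theta)) with conventional hyperbolic
    CORDIC in rotation mode. Converges for roughly |theta| <= 1.118.
    Illustrative only: hardware would use fixed-point shifts and a small
    table of atanh(2**-k) constants instead of floats and math.atanh."""
    # Hyperbolic CORDIC starts at index 1 and must repeat indices
    # 4, 13, 40, ... (each next one is 3*k + 1) to guarantee convergence.
    indices = []
    k, repeat = 1, 4
    while len(indices) < n_iter:
        indices.append(k)
        if k == repeat and len(indices) < n_iter:
            indices.append(k)
            repeat = 3 * repeat + 1
        k += 1

    # Each micro-rotation scales the vector by sqrt(1 - 2**(-2k));
    # pre-dividing x by the total gain compensates for it.
    gain = 1.0
    for k in indices:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * k))

    x, y, z = 1.0 / gain, 0.0, theta
    for k in indices:
        d = 1.0 if z >= 0 else -1.0   # rotate so the residual angle z goes to 0
        t = 2.0 ** -k                 # a right shift by k bits in hardware
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)
    return x, y

def cordic_tanh(theta, n_iter=16):
    """tanh(theta) as sinh/cosh from the CORDIC outputs (hypothetical helper)."""
    c, s = hyperbolic_cordic(theta, n_iter)
    return s / c
```

For example, `cordic_tanh(0.5)` can be compared against `math.tanh(0.5)`. A sigmoid could be derived from the same routine through the identity sigmoid(x) = (1 + tanh(x/2)) / 2, within the convergence range noted above.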
Keywords/Search Tags: CNN, Hardware Acceleration, FPGA, CORDIC, URS-CORDIC, Parallel Computing