
The FPGA Programmable Neural Network Processor Design

Posted on: 2019-05-18  Degree: Master  Type: Thesis
Country: China  Candidate: B R Zhao
GTID: 2428330572451691  Subject: Engineering
Abstract/Summary:
Deep learning is one of the most important breakthroughs in artificial intelligence since the turn of the century. It has achieved remarkable results in image recognition, speech recognition, computer vision and natural language processing. In particular, the convolutional neural network (CNN) is the leading deep learning technique, and it continues to evolve: new classic networks keep appearing, from LeNet and AlexNet to GoogLeNet and VGGNet, and variants from CNN and R-CNN through Fast R-CNN and Faster R-CNN to R-FCN continue to emerge. Given the importance and flexibility of CNNs, there is an urgent need to bring them into practical deployment. The diversity of CNN structures makes acceleration with ASICs challenging. In addition, CNNs involve a large amount of data exchange. To address these problems, this study proposes an FPGA-based programmable neural network processor design. The system adopts a transport triggered architecture combined with multi-channel DMA, multi-port memory and dedicated pooling channels to form a data transport network, which effectively solves the problems of network structure adaptability and massive data exchange. Experiments show that when accelerating the VGG16 network, the system reaches a throughput of 197.1 GOPS. Compared with other designs that adapt to a variety of neural networks, this structure saves 46.5% of hardware multiplier resources and is at least 40% faster, measured in GOPS, than other non-pipelined designs. The design offers high parallelism, resource reuse, programmability, online configuration and high processing speed.
The thesis first introduces the background and development status of convolutional neural networks. It then introduces the classic convolutional neural network, including the single-neuron model, the network structure and the learning algorithm. Next, it explains the programmable convolutional neural network processor in detail. The overall design plan covers three aspects: the software interface, the overall hardware structure and resource statistics, and memory address allocation. Following the overall scheme, each module is designed and verified in detail. Finally, a hardware verification system is built from a Xilinx XCK325T-series FPGA combined with a DDR memory chip, and the feasibility of the processor is verified. Acceleration experiments on two networks, LeNet and VGG16, characterize the processor in terms of speed, resource consumption, power consumption and precision loss. The results show that the processor effectively solves the structural adaptability and massive data exchange problems in CNN acceleration while meeting the requirements for throughput and low power consumption.
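The abstract names a transport triggered architecture but does not show its programming model. The following is a minimal software sketch of the general idea, not the thesis's actual design: a program consists only of data moves, and a function unit computes when a value is moved into its trigger port. The names `FunctionUnit`, `run_moves` and `dot2_demo`, the port names, and the memory layout are all hypothetical and chosen for illustration.

```python
import operator


class FunctionUnit:
    """Hypothetical function unit with one operand port and one trigger port."""

    def __init__(self, op):
        self.op = op          # two-input operation performed by this unit
        self.operand = 0      # value latched by a move to the operand port
        self.result = 0       # value produced by the most recent trigger

    def write(self, port, value):
        if port == "operand":
            self.operand = value                         # latch only, no compute
        elif port == "trigger":
            self.result = self.op(self.operand, value)   # the move fires the unit
        else:
            raise ValueError(f"unknown port: {port}")


def run_moves(moves, units, memory):
    """Execute a list of (source, destination) moves in order.

    A source is ("mem", address) or (unit_name, "result").
    A destination is ("mem", address) or (unit_name, port).
    """
    for (src_kind, src_arg), (dst_kind, dst_arg) in moves:
        if src_kind == "mem":
            value = memory[src_arg]
        else:
            value = units[src_kind].result
        if dst_kind == "mem":
            memory[dst_arg] = value
        else:
            units[dst_kind].write(dst_arg, value)
    return memory


def dot2_demo():
    """Two-element dot product expressed purely as data moves."""
    units = {
        "mul": FunctionUnit(operator.mul),
        "add": FunctionUnit(operator.add),
    }
    # Assumed layout: inputs at 0..1, weights at 2..3, output at 4.
    memory = [3, 4, 2, 5, 0]
    moves = [
        (("mem", 0), ("mul", "operand")),
        (("mem", 2), ("mul", "trigger")),       # mul.result = 3 * 2
        (("mul", "result"), ("add", "operand")),
        (("mem", 1), ("mul", "operand")),
        (("mem", 3), ("mul", "trigger")),       # mul.result = 4 * 5
        (("mul", "result"), ("add", "trigger")),  # add.result = 6 + 20
        (("add", "result"), ("mem", 4)),
    ]
    return run_moves(moves, units, memory)
```

In this model, supporting a different network structure means changing the move list rather than the units, which is one plausible reading of why the abstract pairs the architecture with programmability and online configuration.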
Keywords/Search Tags:Deep Learning, Convolutional Neural Network, Parallel Computing, FPGA