
FPGA-Based Accelerator For Convolutional Neural Network

Posted on: 2020-03-14    Degree: Master    Type: Thesis
Country: China    Candidate: X Z Liang    Full Text: PDF
GTID: 2428330578959456    Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Convolutional neural networks are an important branch of deep learning, a field that has emerged in recent years, and remain a hotspot of current artificial-intelligence research. The weight-sharing structure of a convolutional neural network brings it closer to biological neural networks. Its high recognition accuracy and low model complexity have led to wide use in machine vision, image recognition, speech recognition, image search, and other applications. As neural networks develop, network models grow deeper and more computationally intensive, and existing CPUs and GPUs cannot meet real-time requirements in some low-latency scenarios. FPGAs are highly flexible and reconfigurable, and their abundant computing resources can fully exploit the parallelism inherent in convolutional neural network models, making them well suited to accelerating applications based on convolutional neural networks.

Based on a study of convolutional neural networks and the characteristics of FPGAs, an FPGA-based convolutional neural network accelerator is designed. The accelerator consists of multiple convolution processing units that can perform different calculations for different convolutional layers. A split calculation method is designed for large-scale convolutions, dividing the convolution process into finer-grained parallel computations. Using ping-pong buffering, all convolutional layers are computed simultaneously in a pipeline. Different parallel strategies and computing-resource allocations are adopted for different layers so that the processing time of each pipeline stage is well balanced, thereby achieving high throughput and high resource utilization. The model parameters are compressed by data quantization, and registers are used as secondary storage to reduce memory accesses.

Simulation results show that the proposed accelerator has a computation latency of 10.36 ms at 100 MHz for AlexNet and achieves an overall performance of 128.61 GOPS, outperforming existing convolutional neural network accelerators.
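The "split calculation" idea described above can be illustrated with a minimal software sketch: a large 2-D convolution is divided into small, independent output tiles, each of which could map to a separate convolution processing unit on the FPGA. This is only an illustrative model of the general tiling technique, not the thesis's actual hardware design; the function names and the tile size are hypothetical.

```python
def conv2d_ref(inp, kernel):
    """Plain valid 2-D convolution over a 2-D list (reference version)."""
    H, W = len(inp), len(inp[0])
    K = len(kernel)
    out_h, out_w = H - K + 1, W - K + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0.0
            for i in range(K):
                for j in range(K):
                    acc += inp[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out

def conv2d_split(inp, kernel, tile=2):
    """Same result, computed tile by tile: each (tr, tc) output block is
    independent of the others, so in hardware the tiles could be assigned
    to parallel processing units.  'tile' is an illustrative tile size."""
    H, W = len(inp), len(inp[0])
    K = len(kernel)
    out_h, out_w = H - K + 1, W - K + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for tr in range(0, out_h, tile):          # loop over output tiles
        for tc in range(0, out_w, tile):
            # each tile only reads its own (tile+K-1)^2 input window
            for r in range(tr, min(tr + tile, out_h)):
                for c in range(tc, min(tc + tile, out_w)):
                    acc = 0.0
                    for i in range(K):
                        for j in range(K):
                            acc += inp[r + i][c + j] * kernel[i][j]
                    out[r][c] = acc
    return out
```

Because the tiles share no output elements, reordering the loops this way changes only the schedule, not the result, which is what makes the fine-grained parallel mapping possible.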
Keywords/Search Tags:Convolutional neural network, FPGA accelerator, parallel acceleration, split calculation