
Design And Optimization Of Convolution Array Accelerator Based On FPGA

Posted on: 2020-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: W F Xu
GTID: 2518306518963319
Subject: Computer technology
Abstract/Summary:
With the rapid development and popularization of artificial intelligence technology, neural networks are now applied in many fields, including computer vision problems such as image classification, semantic image segmentation, image retrieval, and object detection. They have begun to replace most traditional algorithms and are gradually being deployed to terminal devices. However, neural networks require a very large amount of computation, which leads to slow processing and high power consumption on hardware. In particular, the heavy data movement and computational complexity of CNNs (Convolutional Neural Networks) pose major power and performance challenges for hardware and hinder their application in embedded devices such as smartphones and smart cars. Accelerating and optimizing CNNs on embedded platforms is therefore increasingly urgent. Based on the characteristics of the Winograd convolution algorithm, this thesis implements a Winograd-based convolutional neural network accelerator on an FPGA. First, a convolution kernel decomposition method for Winograd convolution is proposed: a kernel larger than 3×3 is divided into multiple 3×3 kernels for the convolution operation, and convolutions with different strides are also handled. Second, a Winograd domain conversion module is designed according to the characteristics of the Winograd transformation matrices. It needs no multiplications; simple row and column transformations and data shifts are enough to perform the Winograd domain conversion, which greatly reduces the number of multiplications in convolution. Third, max pooling in the pooling layer is implemented with a systolic array, the ReLU activation function with a simple comparison circuit, and the fully connected layer computation by matrix splitting. The accelerator is tested on a ZedBoard, with AlexNet as the test network and ImageNet as the test data set. The test results show that the accelerator's accuracy differs from that of the software implementation by only 2.36%, while its convolutional-layer computation is 10 times as fast as on the CPU. Overall, the accelerator is faster than the CPU and slower than the GPU, but it is the most energy efficient of the three: its energy efficiency is 101 times that of the CPU and 5.8 times that of the GPU. It also has higher energy efficiency than other convolutional neural network accelerators.
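To illustrate why the Winograd domain conversion can avoid multipliers, the following is a minimal Python sketch of the standard one-dimensional Winograd F(2,3) algorithm. It is not the accelerator's design: the function names `direct_conv_1d` and `winograd_f23` and the sample data are assumptions made for this example, and the thesis works with 3×3 kernels, for which the same transforms are applied along rows and columns.

```python
def direct_conv_1d(d, g):
    """Reference: two outputs of a 3-tap filter over four inputs (6 multiplications)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs using 4 general multiplications."""
    # Filter transform: additions plus a halving, which is a
    # one-bit right shift in fixed-point hardware.
    u = [g[0], (g[0] + g[1] + g[2]) / 2, (g[0] - g[1] + g[2]) / 2, g[2]]
    # Input transform: additions and subtractions only.
    v = [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]
    # Element-wise product: the only general multiplications.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform: additions and subtractions only.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


# Hypothetical sample data, chosen only for illustration.
d = [1, 2, 3, 4]
g = [1, 0, -1]
print(direct_conv_1d(d, g))
print(winograd_f23(d, g))
```

Both calls print the same two values. The filter transform can be computed once per kernel, so at run time only the four element-wise products require multipliers.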
Keywords/Search Tags: Convolutional neural network, Winograd convolution algorithm, FPGA, Accelerator