Design And Optimization Of Convolution Array Accelerator Based On FPGA

Posted on:2020-05-22

Degree:Master

Type:Thesis

Country:China

Candidate:W F Xu

GTID:2518306518963319

Subject:Computer technology

Abstract/Summary:

With the rapid development and spread of artificial intelligence technology, neural networks are being applied in many fields, including computer vision problems such as image classification, semantic segmentation, image retrieval, and object detection. They have begun to replace most traditional algorithms and are gradually being deployed to terminal devices. However, neural networks require a very large amount of computation, which leads to slow processing and high power consumption on hardware. In particular, the heavy data movement and computational complexity of CNNs (Convolutional Neural Networks) pose major power and performance challenges for hardware and hinder their use in embedded devices such as smartphones and smart cars. Accelerating and optimizing CNNs on embedded platforms is therefore increasingly urgent. Based on the characteristics of the Winograd convolution algorithm, this thesis implements a Winograd-based convolutional neural network accelerator on an FPGA. First, a kernel decomposition method for Winograd convolution is proposed: convolution kernels larger than 3×3 are split into multiple 3×3 kernels, and convolutions with different strides are also handled. Second, based on the structure of the Winograd transformation matrices, a Winograd domain conversion module is designed. The conversion itself requires no multiplications; simple row and column transformations and data shifts suffice, which greatly reduces the number of multiplications in the convolution. Third, max pooling in the pooling layer is implemented with a systolic array, the ReLU activation function with a simple comparator circuit, and the fully connected layer by matrix splitting. The accelerator is tested on a ZedBoard using AlexNet with the ImageNet dataset. The results show that the accuracy loss of the FPGA accelerator relative to the software implementation is only 2.36%, while the convolutional layers run 10 times faster than on the CPU. Overall, the accelerator is faster than the CPU and slower than the GPU, but it is the most energy efficient of the three: 101 times the energy efficiency of the CPU and 5.8 times that of the GPU. It also achieves higher energy efficiency than other convolutional neural network accelerators.
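The two Winograd ideas in the abstract can be illustrated with a small software sketch. It is not the thesis's hardware design. It assumes the commonly used F(2×2, 3×3) transform matrices, whose entries are only 0, ±1, and ±1/2, so in hardware the transforms reduce to additions, subtractions, and one-bit shifts. It also assumes one plausible decomposition scheme (zero-padding a larger kernel to 6×6 and splitting it into four 3×3 sub-kernels); the abstract does not state which scheme the thesis uses. All function names are hypothetical.

```python
# Assumed standard F(2x2, 3x3) Winograd matrices (entries 0, +-1, +-1/2 only).
B_T = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]
A_T = [[1, 1, 1, 0], [0, 1, -1, -1]]


def matmul(a, b):
    """Plain matrix product; in hardware these transforms need only add/shift."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_2x2_3x3(d, g):
    """4x4 input tile d, 3x3 kernel g -> 2x2 output tile."""
    u = matmul(matmul(G, g), transpose(G))        # kernel transform (4x4)
    v = matmul(matmul(B_T, d), transpose(B_T))    # input transform (4x4)
    # Element-wise product: 16 multiplications instead of 36 for direct form.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(A_T, m), transpose(A_T))  # output transform (2x2)


def direct_conv(d, g):
    """Reference sliding-window convolution (CNN convention, no kernel flip)."""
    kh, kw = len(g), len(g[0])
    return [[sum(d[i + a][j + b] * g[a][b] for a in range(kh) for b in range(kw))
             for j in range(len(d[0]) - kw + 1)]
            for i in range(len(d) - kh + 1)]


def conv_large_kernel(d, g):
    """2x2 output for a kernel of size 4x4 to 6x6 over a 7x7 input region d.

    Assumed scheme: zero-pad g to 6x6, split it into four 3x3 sub-kernels,
    run each on the matching shifted 4x4 tile, and sum the partial outputs.
    """
    g6 = [[g[a][b] if a < len(g) and b < len(g[0]) else 0 for b in range(6)]
          for a in range(6)]
    out = [[0, 0], [0, 0]]
    for r in (0, 3):
        for c in (0, 3):
            sub_kernel = [row[c:c + 3] for row in g6[r:r + 3]]
            tile = [row[c:c + 4] for row in d[r:r + 4]]
            partial = winograd_2x2_3x3(tile, sub_kernel)
            for i in range(2):
                for j in range(2):
                    out[i][j] += partial[i][j]
    return out
```

For integer inputs, `winograd_2x2_3x3(tile, kernel)` should match `direct_conv(tile, kernel)` exactly, and `conv_large_kernel(region, kernel5x5)` should match a direct 5×5 convolution over the top-left 6×6 of the region, since the padded kernel row and column contribute only zeros.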

Keywords/Search Tags:

Convolutional neural network, Winograd convolution algorithm, FPGA, Accelerator

Related items

1	Design Of Neural Network Accelerator In Multiple Convolutional Modes
2	Optimization And Implementation For FPGA-based Deep Learning Accelerator
3	Zynq-based Convolutional Neural Network Embedded Acceleration System Design
4	Accelerator Design And Research Of Depthwise Separable Convolutional Neural Network Based On FPGA
5	Research On CNN Network Acceleration For Image Classification Based On FPGA
6	Research And Implementation Of SSD Target Detection Technology Based On FPGA Accelerator
7	Research On Acceleration Of Convolutional Neural Networks On FPGA Based On OpenCL
8	Research On Key Technologies Of High Performance Accelerator For Convolution And Recurrent Neural Networks
9	Face Recognition Algorithm And Circuit Design Based On Embedding Feature Of Convolutional Network
10	The Design And FPGA Verification Of A CNN Accelerator With Depthwise Separable Convolutions