Due to the large number of parameters and computations required by convolutional neural networks, it is challenging for conventional embedded devices to achieve real-time processing. FPGAs, however, have emerged as a primary hardware platform for deploying convolutional neural networks in embedded systems, owing to their abundant computing resources, flexible development and deployment options, and low power consumption. This paper focuses on the deployment of the linear convolution calculations and nonlinear activation function calculations of convolutional neural networks on FPGAs, as follows. Firstly, the model was quantized within an acceptable accuracy range, and an efficient, flexible convolution computing engine was designed around 8-bit quantized data to accelerate convolution kernels of various sizes. To enhance the computing power of the DSPs, double 8-bit multiplication was implemented on a single DSP, and the cascaded operation was further extended to 16 DSPs. Additionally, the DSP clock frequency was set to twice the system clock frequency to further increase DSP computing power. To address the resulting clock domain crossing issue, corresponding timing-constraint solutions and a data caching strategy are proposed. Secondly, to address the difficulty and high resource consumption of deploying nonlinear activation functions on FPGAs, Auto-LUT, a nonlinear activation function method based on lookup tables and piecewise linear approximation, is proposed. This method significantly reduces on-chip lookup table and flip-flop consumption while maintaining accuracy. Compared with the NN-LUT method, Auto-LUT reduces the approximation error by 4.32%, lookup table usage by 56.32%, and flip-flop usage by 32.31%. Finally, based on the above optimization methods, an FPGA-based face recognition system was designed and tested on an open dataset. The experimental results show that the recognition time of the FPGA-based face recognition system is only 22 ms; compared with CPUs and GPUs, the FPGA has clear advantages in speed and power consumption. The designed FPGA-based convolutional neural network accelerator achieves a performance of 1130.49 GOPS@INT8 with a power consumption of only 7.832 W, combining high performance with low power consumption.
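As context for the double 8-bit multiplication on a single DSP mentioned above, the following minimal Python sketch emulates one common packing scheme: two signed 8-bit multiplicands that share the same 8-bit multiplier are packed into one wide operand, a single wide multiplication is performed, and the two products are unpacked with a sign correction. The 18-bit field width, the shared-multiplier arrangement, and the function name are illustrative assumptions, not the paper's exact DSP configuration:

```python
def packed_int8_mults(a, b, c):
    """Emulate two signed 8-bit multiplications (a*c and b*c) with one wide
    multiplier by packing a and b into a single operand, a*2^18 + b.
    Illustrative sketch of the double-INT8-per-DSP idea; the 18-bit field
    width is an assumption, not the paper's DSP configuration.
    """
    SHIFT = 18                          # field wide enough to hold an 8x8-bit product
    packed = (a << SHIFT) + b           # pre-add style packing of the two multiplicands
    product = packed * c                # one wide multiplication (the "DSP" operation)

    low = product & ((1 << SHIFT) - 1)  # lower field carries b*c
    if low >= 1 << (SHIFT - 1):         # interpret the field as two's complement
        low -= 1 << SHIFT

    high = product >> SHIFT             # upper field carries a*c ...
    if low < 0:                         # ... minus 1 when b*c borrowed from it
        high += 1
    return high, low                    # (a*c, b*c)


# Quick check over the full signed 8-bit range of the shared multiplier.
assert all(packed_int8_mults(a, b, c) == (a * c, b * c)
           for a in (-128, -1, 0, 127)
           for b in (-128, -1, 0, 127)
           for c in range(-128, 128))
```

The same arithmetic is what allows a single wide hardware multiplier to produce two INT8 products per operation, at the cost of a small unpacking and sign-correction step.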