
An FPGA Implementation of XNOR Neural Network Based on HLS

Posted on: 2022-03-06 | Degree: Master | Type: Thesis
Country: China | Candidate: Y L Yao | Full Text: PDF
GTID: 2518306605466234 | Subject: Advanced Synthesis
Abstract/Summary:
Convolutional neural network algorithms have developed rapidly in recent years, making object detection, image classification, and related technologies widely applicable. As convolutional neural networks grow deeper, accuracy improves, but the number of parameters and the amount of computation increase greatly, which seriously hinders the use of image classification and similar technologies on embedded platforms. To address this problem, this thesis adopts the design idea of the lightweight XNOR network: the input feature maps and weights are binarized, which compresses the storage space of the network model and eases the deployment of neural networks on embedded hardware platforms. With its low power consumption and low latency, the FPGA is well suited to binary convolution operations. This thesis uses the Vivado HLS tool to design a dedicated FPGA accelerator for the optimized XNOR network, so that the network runs efficiently on the FPGA without losing classification accuracy.

The main work and contributions of this thesis are as follows:

(1) The XNOR network structure is analyzed and optimized. First, exploiting the correlation between the convolution and normalization layers, the bias term in the convolution layer is removed, which saves data storage space and computing resources. Second, the padding strategy is analyzed: if binary +1 or -1 is used for padding, features can disappear in some cases. To avoid this, two padding modes are explored, odd-even padding and edge extension; after experimental testing and analysis, edge extension is adopted. To improve the accuracy of the XNOR network and reduce the loss of feature information, residual connections are introduced between convolution layers. Building on the original activation function, a quadratic term is introduced, which alleviates the vanishing-gradient problem to a certain extent. Finally, a lightweight network is constructed by integrating the above methods and is trained on the TensorFlow platform. A classification accuracy of 85% is achieved on the CIFAR-10 dataset, which demonstrates the effectiveness of the network design.

(2) Based on the analysis of the optimized XNOR network structure, the hardware is divided into three modules, namely a data cache module, a controller module, and a computation module, and the complete design is deployed on the FPGA platform. The module design focuses on data-access optimization and computation optimization. For data storage, this thesis analyzes data partitioning and data reuse, and selects an appropriate data cache scheme according to the number of parameters in the network. To improve throughput, coarse-grained and fine-grained pipelining are explored separately. In the convolution layer, partial binary convolution and full binary convolution are designed respectively. The computing units are arranged as a two-dimensional matrix; streaming input and dedicated registers reduce data stalls during computation, while the data overlap produced as the convolution kernel slides is reused as much as possible. In the normalization layer, shifts replace multiplications. In the pooling layer, the output of each pooling block is produced by a pooling tree. Between network layers, an inter-layer cache module with ping-pong buffering realizes a layer-level pipeline. The final architecture, implemented with HLS, achieves a speed-up of nearly 30 times over a CPU.
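The binary convolution at the heart of such a design replaces multiply-accumulate with XNOR and popcount. As a minimal illustration (plain C++ rather than the thesis's Vivado HLS code, with hypothetical names), a dot product of two vectors over {-1, +1}, packed bitwise into 32-bit words, can be computed as:

```cpp
#include <cstdint>

// Dot product of two length-n_bits vectors over {-1, +1}, each packed
// bitwise into 32-bit words (bit value 1 encodes +1, bit value 0 encodes -1).
// XNOR marks positions where the two signs agree, popcount tallies them,
// and dot = matches - mismatches = 2 * matches - n_bits.
// Unused padding bits in the last word must be zero in both operands.
int xnor_dot(const uint32_t* a, const uint32_t* b, int n_words, int n_bits) {
    int matches = 0;
    for (int i = 0; i < n_words; ++i)
        matches += __builtin_popcount(~(a[i] ^ b[i]));  // XNOR + popcount
    matches -= n_words * 32 - n_bits;  // padding bits always "match"; remove them
    return 2 * matches - n_bits;
}
```

In Vivado HLS the same identity would typically be expressed over `ap_uint<N>` operands with the loop fully unrolled; this sketch only shows the arithmetic, not the accelerator's actual pipeline.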
Keywords/Search Tags: Convolutional Neural Network, XNOR-Net, FPGA, HLS