An Accelerated Design of Convolutional Neural Networks Based on Small and Medium-Sized FPGAs

Posted on: 2020-07-26    Degree: Master    Type: Thesis
Country: China    Candidate: J W Yin    Full Text: PDF
GTID: 2428330578960829    Subject: Information processing and communication network system
Abstract/Summary:
In recent years, with the development of deep learning within artificial intelligence, the CNN (Convolutional Neural Network) has received great attention for its excellent performance in computer vision, image recognition and classification, and natural language processing. In the era of big data, convolutional neural networks are becoming ever deeper and their structures ever more complex, and traditional serial CPU computation is increasingly unable to meet requirements for power consumption and speed. Like the GPU, the FPGA (Field-Programmable Gate Array) exploits parallelism to accelerate convolutional neural networks effectively, and it has therefore attracted growing attention. FPGAs outperform GPUs in power consumption and are better suited to mobile embedded applications. In pursuit of better acceleration, ever larger FPGAs are being used in accelerator designs, but large FPGAs are expensive and unsuitable for general use. Although small and medium-sized FPGAs have limited resources, a proper design can still achieve good acceleration. To make full use of small and medium-sized FPGAs and to ease deployment on low-end mobile embedded platforms, this thesis proposes an accelerator design based on small and medium-sized FPGAs. The hardware platform used for the design is the Zynboard development board of the Zynq-7000 series, a low-end FPGA device.

To address the large memory footprint and excessive number of parameters of convolutional neural networks, this thesis preprocesses the network with pruning and BWN (Binary-Weight-Networks) quantization, which reduces the memory footprint and the number of parameters while preserving accuracy. A half-layer parallel framework is then proposed on the basis of the general design of FPGA-accelerated convolutional neural networks. Following the principles and methods of hardware/software partitioning, the framework places the convolutional layers, pooling layers, and activation functions of the network on the FPGA portion for accelerated processing, while layers with many parameters but little computation, such as the fully connected layers, are handled by the ARM portion. This partition fully accounts for the characteristics of each layer of the convolutional neural network and for the limited logic resources of small and medium-sized FPGAs. For input data buffering, this thesis proposes a multi-buffering method built on single and double buffering, which transfers feature-map data quickly and without interruption and greatly improves transmission efficiency. A detailed acceleration design was then carried out for each of the partitioned tasks.
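The pruning and BWN quantization mentioned above are not detailed in this abstract; the following C sketch only illustrates the standard formulations these names usually refer to (magnitude pruning, and BWN's approximation W ≈ α·sign(W) with α the mean absolute weight). The threshold and array contents are hypothetical and the code is not the thesis implementation.

```c
/* Minimal sketch of magnitude pruning and BWN-style weight binarization. */
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Zero out weights whose magnitude falls below `threshold` (hypothetical value). */
static void prune_weights(float *w, size_t n, float threshold)
{
    for (size_t i = 0; i < n; ++i)
        if (fabsf(w[i]) < threshold)
            w[i] = 0.0f;
}

/* Binarize weights to {-1, +1}; returns the scaling factor alpha = mean(|W|),
 * so that W is approximated by alpha * b. */
static float bwn_quantize(const float *w, size_t n, signed char *b)
{
    float alpha = 0.0f;
    for (size_t i = 0; i < n; ++i)
        alpha += fabsf(w[i]);
    alpha /= (float)n;

    for (size_t i = 0; i < n; ++i)
        b[i] = (w[i] >= 0.0f) ? 1 : -1;
    return alpha;
}

int main(void)
{
    float w[] = { 0.42f, -0.03f, -0.77f, 0.09f, 0.51f };
    signed char b[5];

    prune_weights(w, 5, 0.05f);
    float alpha = bwn_quantize(w, 5, b);

    printf("alpha = %f\n", alpha);
    for (int i = 0; i < 5; ++i)
        printf("w[%d] ~= %f\n", i, alpha * b[i]);
    return 0;
}
```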
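The thesis's multi-buffering scheme is likewise only named here, not specified. As a point of reference, the sketch below shows the conventional double ("ping-pong") buffering that such schemes build on: while the accelerator consumes one buffer of feature-map data, the next tile is loaded into the other buffer, so in hardware the transfer overlaps the computation. The tile size and the load/compute functions are hypothetical placeholders, not the thesis design.

```c
/* Minimal sketch of double (ping-pong) buffering for feature-map tiles. */
#include <stddef.h>
#include <stdio.h>

#define TILE_SIZE 1024   /* hypothetical feature-map tile size */

/* Placeholder for a DMA transfer of tile `tile_idx` into `buf`. */
static void load_tile(float *buf, int tile_idx)
{
    for (size_t i = 0; i < TILE_SIZE; ++i)
        buf[i] = (float)tile_idx;          /* stand-in for real input data */
}

/* Placeholder for the convolution engine consuming one tile. */
static float compute_tile(const float *buf)
{
    float acc = 0.0f;
    for (size_t i = 0; i < TILE_SIZE; ++i)
        acc += buf[i];
    return acc;
}

int main(void)
{
    static float ping[TILE_SIZE], pong[TILE_SIZE];
    float *buffers[2] = { ping, pong };
    int num_tiles = 8;

    load_tile(buffers[0], 0);              /* prefetch the first tile */
    for (int t = 0; t < num_tiles; ++t) {
        float *cur  = buffers[t % 2];
        float *next = buffers[(t + 1) % 2];
        if (t + 1 < num_tiles)
            load_tile(next, t + 1);        /* in hardware this overlaps compute */
        printf("tile %d -> %f\n", t, compute_tile(cur));
    }
    return 0;
}
```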
Finally, the design was verified with the AlexNet convolutional neural network and achieved a good acceleration effect. The test results were analyzed from several angles. In terms of image-processing speed, the design is about 12.5 times faster than an i5-4460 CPU and about 2.6 times faster than an Nvidia GTX 750 GPU. In terms of power consumption during the image tests, the design consumes about 7.3% of the power of the i5-4460 CPU and about 1% of the power of the Nvidia GTX 750 GPU. In terms of accuracy, the on-board accuracy of the design is at least 55.2%, a drop of less than 2% from the original accuracy of the network, which meets the accuracy requirements. Overall, the design achieves good acceleration.
Keywords/Search Tags: Small and medium-sized FPGA, Convolutional Neural Network, Pruning and Quantization, Accelerated design