Circuit Design And FPGA Verification Of AlexNet Convolutional Neural Network

Posted on: 2022-03-30  Degree: Master  Type: Thesis
Country: China  Candidate: M Y Zhu  Full Text: PDF
GTID: 2518306602966759  Subject: Master of Engineering
Abstract/Summary:
With the continuous improvement of chip manufacturing technology, coprocessors are also advancing. The RAM resources integrated on FPGA chips keep growing and their performance keeps improving. These characteristics make FPGAs well suited to compute-intensive tasks and allow them to effectively accelerate processing in hardware. The convolutional neural network (CNN) is a typical model among today's compute-intensive workloads. A CNN performs convolution through a cascaded structure of neurons, and its representation-learning ability gives it an irreplaceable role in image segmentation within image processing. CNNs are widely used in industry and academia and are of key importance for image segmentation and pattern recognition in visual recognition. However, the ways to realize CNNs on general-purpose processors are very limited, so research focuses on how to exploit FPGAs for parallel CNN implementations while ensuring timeliness and low power consumption.

AlexNet was the forerunner that demonstrated the high efficiency of convolutional neural networks on diverse data, obtaining its results by using GPUs while keeping training efficient. It is therefore of great significance to study the computational acceleration of AlexNet as a representative of complex CNN models. However, the computation-intensive, memory-intensive, and resource-consuming characteristics of large-scale convolutional neural networks pose many challenges to their implementation.

This paper focuses on the FPGA-based optimization of the AlexNet CNN and proposes an end-to-end FPGA neural network accelerator in which different network layers work simultaneously in a pipeline to improve throughput. A method is proposed to find the optimal parallel strategy for each layer so as to achieve high throughput and high resource utilization. Because the fully connected (FC) layer is memory-intensive, a batch-based computing method is designed and applied to the FC layer to improve memory bandwidth utilization; by applying two different computing modes, vertical and horizontal, in the FC layer, the required on-chip buffer is significantly reduced. To sum up, this paper proposes a method to find the optimal parallelism of each network layer, a batch-based computing method for the FC layer that reduces the required memory resources, and vertical and horizontal computing modes in the FC layer that significantly cut the demand for on-chip buffer.

The innovations and optimizations of this design are as follows: (1) the implementation of the large-scale convolutional neural network AlexNet on a Xilinx VC709 achieves a throughput of 391 images/s and 565.94 GOPS, better than the 194 images/s and 136.97 GOPS of a previous FPGA implementation; (2) the processing time per image is 2.56 ms at peak performance, corresponding to a throughput of 391 images/s, which is 78 times higher than the Caffe tool running on a CPU, 7.82 times higher than the implementation on a Stratix-V GSD8, and 2.92 times higher than the implementation on a Catapult server with a Stratix V D5; (3) for processing one image, the energy consumption of this design is 4.2 times lower than that of a GPU and 329.9 times lower than that of a CPU.
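As a rough illustration of the batch-based FC computation described above, the following Python sketch (NumPy only; the array shapes, tile size, batch size, and function names are illustrative assumptions, not the thesis's actual RTL design) contrasts processing images one at a time, where every FC weight tile must be fetched from off-chip memory once per image, with processing a batch, where each fetched weight tile is reused across the whole batch, improving effective memory bandwidth utilization.

```python
import numpy as np

# Scaled-down illustrative sizes (AlexNet's first FC layer is 9216 -> 4096).
IN_DIM, OUT_DIM = 1024, 512
TILE = 128        # hypothetical weight tile that fits in the on-chip buffer
BATCH = 16        # hypothetical batch size

def fc_per_image(weights, images):
    """One image at a time: every weight tile is re-fetched for each image."""
    outputs, fetched = [], 0
    for x in images:
        y = np.zeros(OUT_DIM)
        for o in range(0, OUT_DIM, TILE):
            w_tile = weights[o:o + TILE, :]       # simulated off-chip fetch
            fetched += w_tile.size
            y[o:o + TILE] = w_tile @ x
        outputs.append(y)
    return np.stack(outputs), fetched

def fc_batched(weights, images):
    """Batch mode: each fetched weight tile is reused for all images."""
    batch = np.stack(images)                      # (BATCH, IN_DIM)
    y = np.zeros((len(images), OUT_DIM))
    fetched = 0
    for o in range(0, OUT_DIM, TILE):
        w_tile = weights[o:o + TILE, :]           # fetched once per tile
        fetched += w_tile.size
        y[:, o:o + TILE] = batch @ w_tile.T       # reused across the batch
    return y, fetched

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((OUT_DIM, IN_DIM))
    imgs = [rng.standard_normal(IN_DIM) for _ in range(BATCH)]
    y1, f1 = fc_per_image(W, imgs)
    y2, f2 = fc_batched(W, imgs)
    assert np.allclose(y1, y2)
    print(f"weights fetched per image: {f1 // BATCH} vs {f2 // BATCH} (batched)")
```

Both paths compute the same FC outputs; the batched version fetches each weight only once per batch instead of once per image, which is the kind of bandwidth saving the batch-based FC method targets.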
Keywords/Search Tags: CNN, AlexNet, hardware accelerator, pipeline, parallelism