High Performance CNN Accelerator Based On Vector-systolic Array

Posted on:2023-09-11

Degree:Master

Type:Thesis

Country:China

Candidate:L S Pan

Full Text:PDF

GTID:2558307061951989

Subject:Integrated circuit engineering

Abstract/Summary:

The volume of image data has grown rapidly as artificial intelligence has advanced and mobile intelligent terminals have become widespread. Traditional manual interpretation can no longer meet the practical demands of image recognition. Convolutional neural networks (CNNs), an emerging approach to image recognition, have achieved major breakthroughs in artificial intelligence tasks, and a growing number of researchers are building on them. An emerging trend in both academia and industry is the deployment of CNNs on edge devices such as smartphones, drones, and artificial intelligence of things (AIoT) devices. However, edge devices usually operate in environments with limited resources and power, where high performance and low energy dissipation are both strongly desired. Improving the performance, power, and area (PPA) of CNN circuits is therefore of great significance.

Systolic arrays have been a crucial architecture for accelerating CNNs since the success of Google's TPU (Tensor Processing Unit). However, a traditional systolic array requires complex peripheral circuits to steer fine-grained input features and weights to the designated processing elements, and its loading/offloading delays are usually large. In this work, we propose a high-throughput, low-delay dual-line-systolic array to accelerate CNNs. With a line-by-line, vector-style systolic dataflow, the peripheral circuits are considerably simplified and the loading/offloading delays are greatly reduced.

Compared with CPUs and GPUs, the conventional platforms for running neural networks, a Field Programmable Gate Array (FPGA) offers small size, low power consumption, high parallel computing capability, and low requirements for hardware platform configuration, so an FPGA serves as the hardware acceleration platform for the CNN in this work. In addition, to take full advantage of the INT8 computation capability of the DSP (digital signal processor) units in the FPGA, the dual-line-systolic array is developed, which doubles the computation throughput. Finally, the proposed accelerator is deployed on a PYNQ-Z2 board to accelerate the VGG16 network. The peak throughput of the convolution layers reaches 107.21 GOPS, which exceeds all previous works on the same hardware platform.
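
The abstract does not say how the INT8 throughput is doubled. One widely used approach on FPGA DSP multipliers is to pack two INT8 operands that share a common multiplicand into a single wide input, so that one multiplication yields two products. The Python sketch below models only that arithmetic, on the assumption that the thesis relies on a similar idea. The function name `packed_dual_mul` and the 18-bit offset `SHIFT` are illustrative choices, not taken from the thesis, and real DSP port widths may constrain the packing differently.

```python
SHIFT = 18  # assumed bit offset between the two packed operands (illustrative)


def packed_dual_mul(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (a*c, b*c) for signed INT8 inputs using one wide multiplication."""
    for v in (a, b, c):
        if not -128 <= v <= 127:
            raise ValueError("operands must fit in signed INT8")

    # Pack a into the upper bits and b into the lower bits of one operand.
    packed = (a << SHIFT) + b

    # A single multiplication now carries both products:
    # packed * c == (a*c << SHIFT) + b*c
    product = packed * c

    # The low SHIFT bits hold b*c in two's complement; sign-extend them.
    low = product & ((1 << SHIFT) - 1)
    if low >= 1 << (SHIFT - 1):
        low -= 1 << SHIFT

    # Removing the low product before shifting undoes the borrow that a
    # negative b*c introduces into the upper field.
    high = (product - low) >> SHIFT
    return high, low
```

For instance, `packed_dual_mul(3, -4, 5)` returns the pair `(15, -20)`, the same results as two separate INT8 multiplications. In hardware, the correction applied to the upper field is typically a small adder after the multiplier.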

Keywords/Search Tags:

Convolutional Neural Network, Systolic Array, FPGA CNN Accelerator

Related items

1	Research On Systolic Array Based Hardware Accelerator For Convolutional Neural Networks
2	Reliability Analysis And Optimization Study For Systolic Array Based Accelerator
3	An Acceleration Structure Of Convolutional Neural Network
4	Research And Implementation Of High-speed Object Detection Network Based On FPGA Accelerator
5	VLSI Architecture Design For Binary Convolutional Neural Network Accelerator
6	Design And Optimization Of Convolution Array Accelerator Based On FPGA
7	Research On Scheduling Strategy Of Multi-core Convolutional Neural Network Accelerator Based On FPGA
8	Design Of FPGA Accelerator For Radar Emitter Recognition Based On Improved CNN
9	FPGA-Based Reconfigurable CNN Accelerator Design
10	Research On Convolutional Neural Network Accelerator Based On FPGA