A Convolutional Neural Network Accelerating Circuit Design And FPGA Implementation

Posted on:2020-08-23

Degree:Master

Type:Thesis

Country:China

Candidate:B F Liu

Full Text:PDF

GTID:2428330626450802

Subject:Integrated circuit engineering

Abstract/Summary:

Convolutional neural networks (CNNs) have achieved great success in image processing by modeling the behavior of the optic nerves of living creatures, and are widely used in image classification, machine vision, pattern recognition and other fields. CNNs are computationally intensive, and their specific computation pattern makes it difficult to implement efficient CNN applications on general-purpose processors. As a common hardware acceleration platform, an FPGA can map complex algorithms onto its internal configurable hardware resources to achieve parallel computation, which offers a new approach to deploying convolutional neural networks on embedded devices. Based on an in-depth analysis of the CNN computing model, this thesis proposes a Zynq-based CNN acceleration system built with a software and hardware co-design methodology. The acceleration system is mainly composed of a configurable hardware accelerator based on a Xilinx Artix-7 FPGA and a software processing system based on an ARM Cortex-A9 CPU.

The main work of this thesis includes:
(1) A configurable hardware acceleration circuit was designed and implemented using high-level synthesis. In its computing engine, parallel computation and pipeline optimization were used to accelerate computation. In its on-chip cache system, a memory partition strategy was adopted to match communication bandwidth to computation throughput. In its control logic, a global register list was designed to store parameters and control the whole accelerator.
(2) The acceleration circuit was integrated into an ARM-centered SoC system using the Vivado IDE, and a communication scheme based on DMA and AXI4-Stream was adopted to transfer data between the PS side and the PL side.
(3) A software processing system for fast deployment of pretrained Caffe models on the accelerator was designed on the Pynq framework, and a parameterized user interface was provided by encapsulating the underlying DMA driver.

A typical CNN model for handwritten digit recognition was selected to test and verify the accelerator on a Xilinx Pynq-Z1 evaluation board. The experimental results show that, at a working frequency of 100 MHz, the accelerator runs the handwritten digit recognition network at 22.65 FPS, a speed-up of at least 25.9 times over the ARM Cortex-A9 CPU working at 650 MHz. The average power of the accelerator is only 1.59 W. In summary, the accelerator achieves a good acceleration effect for CNN applications compared with the CPU while keeping power consumption low. In addition, the accelerator is highly configurable and enables rapid deployment of CNN applications on embedded devices or mobile terminals.
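
The reported figures can be turned into per-frame quantities with a little arithmetic. The Python sketch below is illustrative only and is not taken from the thesis: the function and key names are hypothetical, the CPU numbers are bounds implied by the "at least 25.9 times" speed-up, and the energy estimate assumes that the 1.59 W average power was measured while the accelerator was sustaining 22.65 FPS.

```python
# Figures quoted in the abstract.
ACCEL_FPS = 22.65       # accelerator throughput at 100 MHz
ACCEL_CLOCK_HZ = 100e6  # accelerator working frequency
MIN_SPEEDUP = 25.9      # "at least" speed-up over the Cortex-A9 at 650 MHz
ACCEL_POWER_W = 1.59    # average accelerator power


def derived_metrics(fps, clock_hz, min_speedup, power_w):
    """Derive per-frame quantities from the reported headline numbers.

    All names here are hypothetical helpers for illustration.
    """
    latency_ms = 1000.0 / fps              # time for one frame on the accelerator
    cycles_per_frame = clock_hz / fps      # accelerator clock cycles spent per frame
    cpu_fps_max = fps / min_speedup        # upper bound, since the speed-up is a minimum
    cpu_latency_ms_min = 1000.0 / cpu_fps_max
    # Assumption: average power was drawn while running at the reported FPS.
    energy_per_frame_mj = power_w / fps * 1000.0
    return {
        "latency_ms": latency_ms,
        "cycles_per_frame": cycles_per_frame,
        "cpu_fps_max": cpu_fps_max,
        "cpu_latency_ms_min": cpu_latency_ms_min,
        "energy_per_frame_mj": energy_per_frame_mj,
    }


metrics = derived_metrics(ACCEL_FPS, ACCEL_CLOCK_HZ, MIN_SPEEDUP, ACCEL_POWER_W)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, one frame takes roughly 44 ms (about 4.4 million cycles at 100 MHz) on the accelerator, the CPU baseline needs more than 1.1 s per frame, and each frame costs on the order of 70 mJ.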

Keywords/Search Tags:

Hardware Accelerator, Convolutional Neural Networks, FPGA, SOC, Artificial Intelligence

Related items

1	Design Of Hardware Accelerator Based On FPGA For Convolutional Neural Networks
2	Design And Implementation Of A High-performance Accelerator Dedicated For Convolutional Neural Networks
3	Research On Neural Network Accelerator Based On Embedded Platform
4	Design And Implementation Of Convolutional Neural Network Hardware Accelerator
5	Implementation And Application Of Hardware Accelerator Based On Image Recognition Technology
6	Research On FPGA-based Accelerator Design For Convolutional Neural Networks
7	Research And Design Of Convolutional Neural Network Accelerator Based On Multi-FPGA Co-acceleration
8	Design Of General-purpose Convolutional Neural Network Accelerator Based On FPGA
9	Research On Convolutional Neural Networks Accelerator Based On FPGA
10	Research Of Scalability On FPGA-based Neural Network Accelerator