
Extended Programmable Neural Network Acceleration System of Cortex-M3

Posted on: 2020-06-10    Degree: Master    Type: Thesis
Country: China    Candidate: K G Yang    Full Text: PDF
GTID: 2428330602450401    Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
With the improvement of process technology, computing capability has increased rapidly. Artificial intelligence based on neural networks is widely used in image processing, control systems, pattern recognition, financial management, and other fields. Because deep learning depends heavily on computing capability, most neural networks are currently trained and run on CPUs or GPUs. However, as network structures change and real-time computing requirements grow, these traditional implementations will struggle to satisfy future applications, and many dedicated acceleration circuits have appeared in recent years; ASICs have therefore become an important path for deep learning. By application scenario, acceleration circuits can be divided into two categories: the server cloud and edge computing. Lightweight acceleration circuits for embedded terminals are one trend of development. This paper therefore uses the Cortex-M3 processor IP provided by ARM's DesignStart program and designs an ARM system-on-chip that integrates a programmable neural network acceleration unit for embedded terminals. The main work of this paper is as follows.

(1) The development and current state of neural networks are introduced, and the advantages and disadvantages of software and hardware implementations of neural networks are compared. The inference and backward propagation of the BP neural network algorithm are described in detail, and an acceleration circuit for the BP neural network is designed according to the inference process (a fixed-point sketch of the per-layer computation follows the abstract).

(2) By comparing the advantages and disadvantages of various methods for fitting the neural network activation function, a 6-segment piecewise-linear fit with slopes of 2^-n is chosen, so that division operations can be replaced by shift operations. This reduces computational complexity while keeping the accuracy loss on handwritten digit recognition below 0.2% (see the shift-based sketch below).

(3) Based on the characteristics of the neural network parameters during inference, a distributed buffer and a dynamic ping-pong buffer are designed to optimize the storage architecture of the acceleration system. The write-bandwidth requirement on the external interface is reduced by 93.75%, and network acceleration can proceed in parallel with data reads and writes (a ping-pong sketch follows below).

(4) The neural network accelerator and the Cortex-M3 processor IP are integrated using a system-on-chip design method, so that the accelerator can perform feedforward inference acceleration for different network topologies through processor software programming (a register-programming sketch follows below).

(5) Based on the acceleration system designed in this paper, a handwritten digit recognition neural network is implemented on an FPGA platform. Compared with the simulation test results from MATLAB and C, the acceleration system designed in this paper consumes 2.8 W and is nearly 10 times faster than the CPU implementation.

The extended programmable neural network acceleration system of Cortex-M3 designed in this paper combines the processor with acceleration peripherals, giving the whole system low power consumption, low bandwidth requirements, high parallelism, and a configurable network structure. It basically meets the design goal of real-time neural network inference in embedded application scenarios.
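For point (1), the core of feedforward inference is a multiply-accumulate loop per layer. The following is a minimal C sketch of that computation; the Q15 fixed-point format, the function name, and the accumulator width are illustrative assumptions, not the thesis's actual datapath.

    #include <stdint.h>

    /* One fully connected layer: out = act(W * in + b), Q15 fixed point.
     * A real design would also guard against accumulator overflow. */
    void fc_layer(const int16_t *w, const int16_t *b, const int16_t *in,
                  int16_t *out, int n_in, int n_out,
                  int16_t (*act)(int32_t))
    {
        for (int o = 0; o < n_out; o++) {
            int32_t acc = (int32_t)b[o] << 15;            /* bias widened to Q30 */
            for (int i = 0; i < n_in; i++)
                acc += (int32_t)w[o * n_in + i] * in[i];  /* Q15 x Q15 = Q30 MAC */
            out[o] = act(acc >> 15);                      /* back to Q15, activate */
        }
    }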
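For point (2), the key idea is that a segment slope of 2^-n turns the multiply into a right shift. The sketch below illustrates this with the well-known PLAN sigmoid approximation (three segments plus saturation per side, exploiting symmetry); the thesis fits its own six segments, so these breakpoints and offsets are stand-ins, not the fitted coefficients.

    #include <stdint.h>

    /* Piecewise-linear sigmoid, Q15 in/out. Every slope is a power of
     * two, so hardware needs only shifts and adds, no divider. */
    int16_t pwl_sigmoid_q15(int32_t x)
    {
        int32_t ax = x < 0 ? -x : x;             /* sigmoid(-x) = 1 - sigmoid(x) */
        int32_t y;

        if (ax < 1 * 32768)                      /* |x| < 1.0   */
            y = 16384 + (ax >> 2);               /* 0.5 + |x|/4     (slope 2^-2) */
        else if (ax < (19 * 32768) / 8)          /* |x| < 2.375 */
            y = 20480 + (ax >> 3);               /* 0.625 + |x|/8   (slope 2^-3) */
        else if (ax < 5 * 32768)                 /* |x| < 5.0   */
            y = 27648 + (ax >> 5);               /* 0.84375 + |x|/32 (slope 2^-5) */
        else
            y = 32767;                           /* saturate at ~1.0 */

        return (int16_t)(x < 0 ? 32768 - y : y); /* mirror for negative inputs */
    }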
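For point (3), a ping-pong buffer lets the bus fill one buffer while the accelerator computes from the other, which is how data movement overlaps with computation. This is a sketch of the control flow only; the buffer size and the accel_start/accel_wait/dma_load calls are hypothetical names, not the thesis's driver API.

    #include <stdint.h>

    #define BUF_WORDS 4096
    static int16_t wbuf[2][BUF_WORDS];          /* ping-pong weight buffers */

    extern void accel_start(const int16_t *w, int layer);  /* hypothetical */
    extern void accel_wait(void);                          /* hypothetical */
    extern void dma_load(int16_t *dst, int layer);         /* hypothetical */

    void run_network(int n_layers)
    {
        dma_load(wbuf[0], 0);                   /* preload first layer */
        for (int k = 0; k < n_layers; k++) {
            accel_start(wbuf[k & 1], k);        /* compute from one buffer */
            if (k + 1 < n_layers)
                dma_load(wbuf[(k + 1) & 1], k + 1); /* fill the other in parallel */
            accel_wait();                       /* sync before swapping */
        }
    }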
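For point (4), "programmable through processor software" typically means the Cortex-M3 writes the network topology into the accelerator's memory-mapped registers before starting inference. The base address and register layout below are invented for illustration; the real register map is defined by the thesis's SoC design.

    #include <stdint.h>

    #define ACCEL_BASE 0x40010000u              /* hypothetical peripheral address */
    typedef volatile struct {
        uint32_t layer_cnt;                     /* number of layers */
        uint32_t layer_dim[8];                  /* neurons per layer */
        uint32_t ctrl;                          /* bit0 = start */
        uint32_t status;                        /* bit0 = done */
    } accel_regs_t;
    #define ACCEL ((accel_regs_t *)ACCEL_BASE)

    /* Program a feedforward topology and run one inference pass. */
    void accel_run(const uint32_t *dims, uint32_t n)
    {
        ACCEL->layer_cnt = n;
        for (uint32_t i = 0; i < n; i++)
            ACCEL->layer_dim[i] = dims[i];      /* topology set by software */
        ACCEL->ctrl = 1u;                       /* kick off inference */
        while (!(ACCEL->status & 1u))           /* poll for completion */
            ;
    }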
Keywords/Search Tags: neural network, SoC, ARM, FPGA