Convolution Neural Network Accelerator For General DSP

Posted on:2019-02-10

Degree:Master

Type:Thesis

Country:China

Candidate:R Song

Full Text:PDF

GTID:2428330611993653

Subject:Electronic Science and Technology

Abstract/Summary:

With the development of artificial intelligence applications, deep learning algorithms based on the Convolutional Neural Network (CNN) have become the core algorithms in machine vision processing. However, as Internet data grows explosively, deep learning networks have become increasingly complex, and traditional general-purpose processors can no longer meet their performance requirements. Studying hardware accelerators for these networks is therefore of great significance for the design of intelligent chips. A DSP targets digital signal processing applications and is characterized by low power consumption, good cost-performance, and good programmability. Integrating a dedicated CNN accelerator into a general-purpose DSP architecture can give full play to the performance of the DSP: it reuses the DSP's existing programming environment, quickly improves the DSP's performance on deep learning applications, and rapidly extends the scope of the DSP's intelligent computing. In this thesis, a CNN accelerator for image processing and intelligent video processing is designed and implemented based on the X-DSP architecture. The main work is as follows.

First, based on an analysis of convolutional neural network algorithms and of the architecture and performance of X-DSP, the overall structure of the CNN accelerator is designed, and the performance of the accelerator's operation unit and the size of the on-chip cache are determined according to the external memory bandwidth. Theoretical analysis and on-chip cache design are carried out through a parallelism analysis of the convolution layer algorithm. Max pooling and average pooling in the pooling layer are then discussed, and an extension of the operation unit's functions is proposed. The ReLU activation function is implemented to activate all output feature values produced by the convolution layer.

Second, a scheme of software compression and hardware decompression is proposed, together with a descriptor-based control mechanism. Users call the library functions provided by the DSP, and no intervention is needed while the accelerator is running. The accelerator is connected to the DSP through a DSP-oriented interface that adopts the AXI bus protocol.

Finally, the RTL code of the CNN accelerator is implemented in Verilog HDL, and module-level functional verification and performance evaluation are completed. In addition, the design is logically synthesized for a target frequency of 1 GHz using a manufacturer's 40 nm standard cell library, and the timing meets the requirements.
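As a rough illustration of the three layer operations named in the abstract (convolution, ReLU activation, and max/average pooling), the following plain-Python reference model may help. It is only a sketch and is not taken from the thesis: the function names `conv2d`, `relu` and `pool2d`, the stride of 1, the absence of padding, and the non-overlapping pooling window are all assumptions made for the example, and it says nothing about how the X-DSP accelerator maps these operations onto hardware.

```python
# Software reference model only; all names and parameters are hypothetical
# and chosen for illustration, not drawn from the thesis.

def conv2d(image, kernel):
    """Stride-1, no-padding 2D convolution in the CNN sense (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Multiply-accumulate over one kernel-sized window of the input.
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Element-wise ReLU: negative feature values are clamped to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def pool2d(fmap, size=2, mode="max"):
    """Non-overlapping size x size pooling; mode is 'max' or 'avg'."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    out = []
    for r in range(0, len(fmap) - size + 1, size):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window) if mode == "max"
                       else sum(window) / len(window))
        out.append(row)
    return out


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    features = relu(conv2d(image, [[1, 1], [1, 1]]))
    print(features)
    print(pool2d(features, mode="max"), pool2d(features, mode="avg"))
```

Max and average pooling differ only in the reduction applied to each window, which is consistent with the abstract's remark that pooling is handled by extending the functions of the operation unit rather than by a separate datapath.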

Keywords/Search Tags:

AI Chip, CNN Accelerator, Compression, 2D Memory, buffer

Related items

1	A Convolutional Neural Networks Accelerator Based On Parallel Memory Technology
2	Researches On Non-volatile Memory Based Power Optimization For Network-on-Chip Router Buffer
3	An Accelerator Platform Based On FPGA For Clustering Algorithms
4	Design Of FPGA Convolution Neural Network Accelerator Based On HLS
5	The Memory Management And Performance Optimization Of Caffe On The Master-slave Accelerator
6	Research On Memory Bus Width Aware Compression Technology Of Image Super-resolution Model Algorithm Based On FPGA
7	Research On The Key Technology Of Router Buffer For NoC
8	Research And Implementation Of High-speed LTE Communication Based On Single Chip Platform
9	The Study Of Voltage-Controlled MTJ Enabled AI Accelerator
10	Research On Key Technology Of Sparse Recurrent Neural Network Customized Accelerator