Convolution Neural Network Accelerator For General DSP

Posted on: 2019-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: R Song
GTID: 2428330611993653
Subject: Electronic Science and Technology
Abstract/Summary:
With the development of artificial intelligence applications, deep learning algorithms based on the Convolutional Neural Network (CNN) have become the core algorithms in machine vision. However, in the face of the explosive growth of Internet data, deep learning networks have become increasingly complex, and traditional general-purpose processors can no longer meet their performance requirements. Studying hardware accelerators for these networks is therefore of great significance for the design of intelligent chips. A DSP is oriented toward digital signal processing applications and offers low power consumption, high cost-effectiveness, and good programmability. Integrating a dedicated CNN accelerator into a general-purpose DSP architecture builds on the strengths of the DSP: it reuses the DSP's original programming environment, quickly improves the DSP's performance on deep learning applications, and extends the scope of the DSP's intelligent computing. In this thesis, a CNN accelerator for image processing and intelligent video processing is designed and implemented based on the X-DSP architecture. The main work is as follows:

First, based on an analysis of the convolutional neural network algorithm and of the architecture and performance of X-DSP, the overall structure of the CNN accelerator is designed, and the performance of the accelerator's operation unit and the size of the on-chip cache are determined according to the external storage bandwidth. The parallelism of the convolution layer algorithm is analyzed theoretically, and the on-chip cache is designed accordingly. For the pooling layer, max pooling and average pooling are discussed, and an extension of the operation unit's function is proposed to support them. The ReLU activation function is implemented and applied to all output feature values produced by the convolution layer.

Second, a scheme of software compression and hardware decompression is proposed, together with a descriptor-based control mechanism. Users set up the accelerator through library functions provided for the DSP, and no further intervention is required while it runs. The accelerator is connected to the DSP through an interface based on the AXI bus protocol.

Finally, the RTL code of the CNN accelerator is implemented in Verilog HDL, and module-level functional verification and performance evaluation are completed. The design is also logic-synthesized at a target frequency of 1 GHz using a vendor's 40 nm standard cell library, and the timing meets the requirements.
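For readers unfamiliar with the three layer operations the abstract names (convolution, ReLU activation, and max/average pooling), the following Python sketch shows what each computes. It is a plain software reference under stated assumptions, not the thesis's hardware design: the function names, the stride-1 unpadded convolution, and the non-overlapping pooling window are all illustrative choices that do not come from the source.

```python
# Software reference model only. Names, stride, padding, and window size
# are illustrative assumptions, not details taken from the thesis.

def conv2d(image, kernel):
    """Stride-1, unpadded 2D convolution (cross-correlation, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            # Multiply-accumulate over the kernel window.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def relu(fmap):
    """Apply ReLU element-wise: negative values become 0."""
    return [[max(0, v) for v in row] for row in fmap]


def pool2d(fmap, size=2, mode="max"):
    """Non-overlapping size x size pooling; partial edge windows are dropped."""
    out = []
    for r in range(0, len(fmap) - size + 1, size):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            elif mode == "avg":
                row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'avg'")
        out.append(row)
    return out
```

Max and average pooling differ only in the reduction applied to each window, which is consistent with the abstract's description of supporting both by extending the function of one operation unit.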
Keywords/Search Tags: AI Chip, CNN Accelerator, Compression, 2D Memory, Buffer