Research On Hardware Acceleration Of 3D Convolutional Neural Network Algorithm Based On DSP

Posted on:2021-02-10

Degree:Master

Type:Thesis

Country:China

Candidate:W W Chen

Full Text:PDF

GTID:2518306548493484

Subject:Electronic Science and Technology

Abstract/Summary:

With the development of deep learning research in recent years, deep neural network algorithms have achieved great success in image processing, natural language processing, speech recognition, and other fields. The three-dimensional (3D) convolutional neural network is a branch of the deep neural network. Compared with ordinary neural networks, it is suited to higher-dimensional or more complex data processing tasks, such as video classification, medical image segmentation, and point cloud data processing, and its efficiency and accuracy in these fields have been confirmed. Although the excellent performance of the 3D convolutional neural network has been recognized, its huge computation and data volume limit its popularization and application, so the study of hardware acceleration methods for 3D networks has become an inevitable trend.

FT-DSP is a domestically developed digital signal processor (DSP) chip. It offers high performance and programmability, supports scalar and vector operations, and integrates algorithms such as the fast Fourier transform and matrix multiplication. Previous studies have shown that the structural characteristics of DSPs are well suited to convolution calculations. Against this background, in order to expand the application scenarios of domestic DSPs and to improve the domestic DSP architecture for the coming era of intelligent computing, this thesis works closely with the characteristics of the FT-DSP architecture. It systematically studies vectorized mapping methods for the different layers of a 3D convolutional neural network, such as the convolution, pooling, and fully connected layers, and designs efficient data access methods for the weight parameters accordingly. Based on the concept of hardware acceleration of deep learning algorithms, it implements the mapping, programming, and optimization of a complete 3D convolutional neural network algorithm on the DSP. Throughout the research, the focus has been on high efficiency, generality, software-hardware collaboration, and the combination of algorithm and architecture, so as to make full use of the computing, storage, and transmission resources of the hardware to achieve acceleration.

The main work and innovations of this thesis include:
1. Exploiting the parallelism of the algorithm on the basis of the FT-DSP chip structure, a vectorized mapping method is proposed for 3D convolution, the most important calculation in the algorithm. The convolution is converted into vector operations, making full use of the scalar and vector computation of the DSP. In addition, a packet access scheme is designed for the large number of weight parameters involved in the calculation. The experimental results show that 3D convolutional layers of different sizes achieve good acceleration on FT-DSP.
2. Corresponding mapping methods are proposed for the pooling layer, fully connected layer, ReLU layer, and padding layer of the 3D convolutional network, converting operations such as maximum-value selection and matrix multiplication into vector form.
3. To optimize the mapping of the convolutional layer, a vectorized implementation based on the Winograd algorithm is proposed. Matrix transformations replace expensive multiplications with inexpensive additions, an algorithm-level optimization that effectively reduces the total number of calculations and the calculation time.
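
The abstract does not give the details of the Winograd-based mapping, so the following is only a minimal sketch of the underlying idea, using the textbook one-dimensional case F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. It is not the thesis's 3D or FT-DSP implementation, and the function names `direct_fir` and `winograd_f23` are hypothetical, chosen for illustration.

```python
def direct_fir(d, g):
    """Two outputs of a 3-tap filter over 4 inputs, computed directly.

    Uses 6 multiplications on the data path.
    """
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """The same two outputs via the Winograd F(2,3) minimal filtering form.

    Uses 4 multiplications on the data path; the rest is additions.
    """
    # Filter transform: depends only on the weights, so it can be
    # precomputed once per filter and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform: additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # Element-wise products: the only 4 data-path multiplications.
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3

    # Output transform: additions and subtractions only.
    return [m0 + m1 + m2, m1 - m2 - m3]


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    weights = [2, 4, 6]
    print(direct_fir(data, weights))
    print(winograd_f23(data, weights))
```

The element-wise product step is the part that maps naturally onto vector hardware, since the four products are independent. Extending the idea to 2D or 3D tiles nests the same transforms along each dimension, which is where the savings in multiplications grow.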

Keywords/Search Tags:

Three-Dimensional Convolutional Neural Network, Digital Signal Processor, Winograd Algorithm, Hardware Acceleration

Related items

1	Design Of Neural Network Accelerator In Multiple Convolutional Modes
2	Research On Acceleration Of Convolutional Neural Networks On FPGA Based On OpenCL
3	The Design Of RISC-V Processor Suitable For Accelerated Convolutional Neural Network On The Edge Side Of The Internet Of Things
4	Design Of Convolutional Neural Network Acceleration System Based On Open Source RISC-V Processor
5	Research On Hardware Acceleration Based On FPGA Of Convolutional Neural Network And Elliptic Curve Algorithm
6	Research On CNN Network Acceleration For Image Classification Based On FPGA
7	Zynq-based Convolutional Neural Network Embedded Acceleration System Design
8	A Convolutional Neural Network Accelerator For Limited Hardware Computing Resources
9	Hardware Accelerator Design Of Convolutional Neural Networks For Low Power And High Performance
10	Acceleration System Design And Implement For Convolutional Neural Network Based On SOC FPGA