Design And Implementation Of Energy-Efficient Configurable Convolution Accelerator For CNN

Posted on:2020-05-21

Degree:Master

Type:Thesis

Country:China

Candidate:M Xu

Full Text:PDF

GTID:2428330590958176

Subject:Microelectronics and Solid State Electronics

Abstract/Summary:

Image recognition is one of the most important parts of computer vision and plays an important role in the industrial Internet of Things. Convolutional neural networks have become the first choice for image recognition because of their high recognition accuracy. In many image recognition applications, such as collision detection for drones, real-time requirements mean that neural network inference must be performed on local hardware. However, the power consumption of a GPU is too high for embedded platforms with limited power and resources. Therefore, an energy-efficient and reconfigurable convolution accelerator is needed to meet the requirements of running convolutional neural networks on embedded platforms.

Based on the computing model of the convolutional layer, this thesis analyzes the energy consumption of different storage structures and of multiple data reuse methods in convolutional layer operations. Combining this analysis with a two-dimensional processing element array, it proposes a convolution accelerator architecture for convolutional layer operations. First, a configurable processing element array is used to closely match the convolution mapping, which significantly improves processing element utilization. Then, a processing element with local storage and support for interleaved operation is designed; it reuses the input feature map and accumulates intermediate results, significantly reducing off-chip memory accesses for input feature map and convolution kernel data. Finally, a dedicated dataflow-processing on-chip network based on a tree network is designed. It shares input feature map and convolution kernel data across the processing element array and balances data transmission between the processing element array and the global buffer.

The system is first implemented in a hardware description language and functionally simulated in ModelSim. It is then verified on an FPGA using the Vivado platform. Finally, based on the TSMC 90 nm CMOS process, DC is used for synthesis and EDI for placement and routing, yielding the overall layout. When running the convolutional layers of the classic image recognition network AlexNet, the system reaches a peak throughput of 15.6 GMACS at a clock frequency of 200 MHz, with an average processing element utilization of 94%. The accelerator is also highly configurable and can adapt to many different types of convolutional layer operations.
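The throughput figures above can be related to each other with some simple arithmetic. The following is a minimal sketch, not the thesis's own model: the function names are hypothetical, the AlexNet first-layer shape (227x227x3 input, 96 kernels of 11x11, stride 4, 55x55 output) is the commonly published one rather than something stated in the abstract, and treating the 94% average utilization as a scaling factor on peak throughput is an assumption.

```python
def conv_layer_macs(out_h, out_w, out_c, in_c, k_h, k_w):
    """Multiply-accumulate count for one ungrouped convolutional layer (batch size 1)."""
    return out_h * out_w * out_c * in_c * k_h * k_w


def macs_per_cycle(peak_macs_per_s, clock_hz):
    """MAC operations per clock cycle implied by a peak throughput figure."""
    return peak_macs_per_s / clock_hz


def sustained_macs_per_s(peak_macs_per_s, utilization):
    """Assumed model: sustained throughput = peak throughput x average utilization."""
    return peak_macs_per_s * utilization


# Figures quoted in the abstract.
PEAK_MACS_PER_S = 15.6e9   # 15.6 GMACS
CLOCK_HZ = 200e6           # 200 MHz
AVG_UTILIZATION = 0.94     # 94% average processing element utilization

if __name__ == "__main__":
    # Implied parallelism at peak: 15.6e9 / 200e6
    print(macs_per_cycle(PEAK_MACS_PER_S, CLOCK_HZ))        # 78.0

    # AlexNet conv1 (assumed shape, see lead-in): 55*55*96*3*11*11
    print(conv_layer_macs(55, 55, 96, 3, 11, 11))           # 105415200

    # Rough sustained throughput under the assumed utilization model.
    print(sustained_macs_per_s(PEAK_MACS_PER_S, AVG_UTILIZATION) / 1e9)
```

The 78 MACs per cycle is only what the two quoted numbers imply at peak; the abstract does not state the dimensions of the processing element array, so no array shape is inferred here.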

Keywords/Search Tags:

Convolutional neural network, Energy-efficient accelerator, ASIC, Dataflow processing

Related items

1	The Convolutional Neural Network Accelerator Research Based On The Tiling Dataflow
2	Design And Research Of Sparse Convolutional Neural Network Accelerator On FPGAs
3	The Design And FPGA Verification Of A CNN Accelerator With Depthwise Separable Convolutions
4	An Efficient General Accelerator For Convolutional Neural Network
5	Research On Heterogeneous Reconfigurable Dataflow Accelerator For Big Data Applications
6	Scalable And Energy-efficient CNN Accelerator Design Based On Dynamic Accuracy
7	Research And Design For High Performance CNN Hardware Accelerator
8	Efficient And Reconfigurable Deep Convolutional Neural Network Acceleration System With 3D Stacked Memory
9	Design Of General-purpose Convolutional Neural Network Accelerator Based On FPGA
10	Design And Implementation Of Energy-Efficient Binary Neural Network Accelerator