
Design And Implementation Of Energy-Efficient Configurable Convolution Accelerator For CNN

Posted on: 2020-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: M Xu
GTID: 2428330590958176
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Image recognition is one of the most important areas of computer vision and plays an important role in the industrial Internet of Things. Convolutional neural networks (CNNs) have become the first choice for image recognition because of their high recognition accuracy. In many image recognition applications, such as collision detection for drones, real-time requirements mean that network inference must be performed on local hardware. However, the power consumption of a GPU is too high for embedded platforms with limited power and resources. An energy-efficient, reconfigurable convolution accelerator is therefore needed to meet the requirements of CNN applications on embedded platforms.

Based on a computing model of the convolutional layer, this thesis analyzes the energy consumption of different storage structures and of several data reuse methods in convolutional layer operations. Combining this analysis with a two-dimensional processing element (PE) array, it proposes a convolution accelerator architecture for convolutional layer operations. First, a configurable PE array is used to closely match the convolution mapping, which significantly improves PE utilization. Second, a PE with local storage and support for interleaved operation is designed; it reuses the input feature map and accumulates intermediate results, which significantly reduces off-chip memory accesses for input feature map and convolution kernel data. Finally, a dedicated dataflow-processing on-chip network is designed based on a tree network; it shares input feature map and convolution kernel data across the PE array and balances data transfer between the PE array and the global buffer.

The system is first implemented in a hardware description language and functionally simulated in ModelSim. It is then verified on an FPGA using the Vivado platform. Finally, in a TSMC 90 nm CMOS process, DC is used for synthesis and EDI for placement and routing to obtain the complete layout. When running the convolutional layers of the classic image recognition network AlexNet, the system reaches a peak throughput of 15.6 GMACS at a clock frequency of 200 MHz, with an average PE utilization of 94%. The accelerator is also highly configurable and can adapt to many different types of convolutional layers.
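To put the reported throughput in context, the short Python sketch below counts the multiply-accumulate (MAC) operations of a convolutional layer and converts the peak figure into MACs per clock cycle. It is an illustration only, not the thesis's own model: the function names are hypothetical, GMACS is read as 10^9 MAC operations per second, and the example layer uses the commonly cited AlexNet first-layer shape (227x227x3 input, 96 kernels of size 11x11, stride 4), which the abstract itself does not specify.

```python
def conv_output_size(in_size, kernel, stride, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (in_size + 2 * padding - kernel) // stride + 1


def conv_macs(in_h, in_w, in_ch, out_ch, kernel, stride, padding=0):
    """Number of MAC operations in one convolutional layer (square kernel)."""
    out_h = conv_output_size(in_h, kernel, stride, padding)
    out_w = conv_output_size(in_w, kernel, stride, padding)
    # Each output value needs kernel * kernel * in_ch multiply-accumulates.
    return out_h * out_w * out_ch * kernel * kernel * in_ch


# Figures reported in the abstract (GMACS assumed to mean 1e9 MAC/s).
PEAK_MACS_PER_SECOND = 15.6e9
CLOCK_HZ = 200e6

# Peak MAC operations completed per clock cycle.
macs_per_cycle = PEAK_MACS_PER_SECOND / CLOCK_HZ

# Assumed example layer: commonly cited AlexNet conv1 shape.
conv1_macs = conv_macs(in_h=227, in_w=227, in_ch=3, out_ch=96, kernel=11, stride=4)

# Ideal lower bound on latency: ignores utilization and memory stalls.
conv1_ideal_seconds = conv1_macs / PEAK_MACS_PER_SECOND

if __name__ == "__main__":
    print(f"MACs per cycle at peak: {macs_per_cycle:.0f}")
    print(f"conv1 MACs: {conv1_macs}")
    print(f"conv1 ideal time: {conv1_ideal_seconds * 1e3:.2f} ms")
```

Under these assumptions, the peak figure corresponds to 78 MAC operations per cycle, and the example layer needs about 105 million MACs. The real latency would be higher than the ideal bound, since the reported average utilization is 94% and data movement is not modeled here.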
Keywords/Search Tags:Convolutional neural network, Energy-efficient accelerator, ASIC, Dataflow processing