
Research on Deformable Convolution Layer Acceleration Based on Classic CNN Accelerators

Posted on: 2022-02-25    Degree: Master    Type: Thesis
Country: China    Candidate: C Chu    Full Text: PDF
GTID: 2518306560979349    Subject: IC Engineering
Abstract/Summary:
Deformable convolutional networks (DCNs) are widely used in vision tasks and have shown good performance in fields such as object detection, semantic segmentation, object classification, and video action detection. The traditional convolutional layers and the deformable convolutional layers in a DCN are the main sources of computational cost. However, existing neural network accelerators focus mainly on optimizing and accelerating traditional convolutional layers and pay little attention to the deformable convolutional layer. A common approach in current research is to modify the algorithm so that deformable convolution is easier to map onto hardware, but such modifications reduce the accuracy of the network to some degree. As a result, there is currently no accelerator that supports the complete deformable convolution algorithm.

This thesis studies the hardware acceleration of deformable convolution in depth and realizes acceleration of the deformable convolutional layer on top of two baselines: a ReRAM-based convolutional neural network accelerator and a 1D-array convolutional neural network accelerator.

For the ReRAM-based accelerator, we select ReRAM arrays of different sizes and ReRAM cells of different precisions so that the array better supports deformable convolution. We also propose a novel mapping method for bilinear interpolation that avoids the high latency and power consumption caused by sequential writes to the ReRAM array, reduces the number of hardware operations, and completes all bilinear interpolation calculations in situ. We then modify the structure of the input buffer, which avoids a power-hungry four-port design while still supplying the computing array with enough data. Based on this buffer structure, we design a corresponding online index generation unit and improve area utilization through a blocking strategy. Finally, by redesigning the data flow, we increase the parallelism of the accelerator when executing deformable convolution. Experimental results show that, compared with CPU, GPU, ASIC+CPU, and ISAAC+CPU platforms, our design improves performance by 227x, 15.1x, 26.8x, and 20.4x, respectively, and reduces energy consumption by 225x, 17.4x, 32.9x, and 38.6x, respectively.
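To make the accelerated computation concrete, the following is a minimal NumPy sketch of a deformable convolution layer with its bilinear sampling, i.e., the operation both accelerator designs must support without modifying the algorithm. The single-channel simplification, function names, and offset layout are illustrative assumptions, not the thesis's implementation.

```python
# Minimal sketch of deformable convolution with bilinear sampling.
# Single channel, unit stride, zero padding; all names are illustrative.
import numpy as np

def bilinear_sample(fmap, y, x):
    """Sample fmap (H x W) at a fractional location (y, x), zero outside."""
    H, W = fmap.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < H and 0 <= xx < W:
                # Weight of each integer neighbor is the product of the
                # 1D linear interpolation coefficients.
                val += (1.0 - abs(y - yy)) * (1.0 - abs(x - xx)) * fmap[yy, xx]
    return val

def deform_conv2d_single(fmap, weight, offsets):
    """Deformable convolution for one input/output channel.

    fmap:    H x W input feature map
    weight:  k x k kernel
    offsets: H x W x k x k x 2 learned offsets (dy, dx), one per output
             pixel and kernel tap, produced by a separate offset branch.
    """
    H, W = fmap.shape
    k = weight.shape[0]
    out = np.zeros((H, W))
    for p in range(H):
        for q in range(W):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    dy, dx = offsets[p, q, i, j]
                    # Regular grid position plus the learned fractional offset.
                    y = p + i - k // 2 + dy
                    x = q + j - k // 2 + dx
                    acc += weight[i, j] * bilinear_sample(fmap, y, x)
            out[p, q] = acc
    return out
```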
For the more widely used 1D-array convolutional neural network accelerator, this thesis designs a 1D-array deformable convolution accelerator. Following the algorithmic characteristics of deformable convolution, we divide its execution into three stages and splice and fuse the processing stages to avoid repeated data transfers and their expensive communication cost. We then design a new address conversion unit that converts feature indices into input-buffer indices and, at the same time, computes the coefficients required for bilinear interpolation, turning the interpolation into a small convolution operation. Finally, we deploy the bilinear interpolation onto a one-dimensional dot-product array. Experimental results show that our design significantly outperforms the other three hardware platforms when implementing deformable convolutional neural networks.
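As an illustration of the address conversion idea described above, the sketch below decomposes one fractional sampling location into four input-buffer indices and four coefficients, so that bilinear interpolation reduces to a length-4 dot product suitable for a one-dimensional dot-product array. The function name, row-major buffer layout, and zero-padding convention are illustrative assumptions rather than the actual hardware unit.

```python
# Sketch: turn one bilinear interpolation into a small dot product by
# precomputing buffer indices and coefficients (illustrative only).
import numpy as np

def address_convert(y, x, H, W):
    """Return (indices, coeffs) for bilinear sampling at fractional (y, x).

    indices are flat row-major positions into an H*W input buffer; samples
    that fall outside the feature map get coefficient 0 and index 0.
    """
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    fy, fx = y - y0, x - x0
    taps = [(y0,     x0,     (1 - fy) * (1 - fx)),
            (y0,     x0 + 1, (1 - fy) * fx),
            (y0 + 1, x0,     fy * (1 - fx)),
            (y0 + 1, x0 + 1, fy * fx)]
    indices, coeffs = [], []
    for yy, xx, c in taps:
        if 0 <= yy < H and 0 <= xx < W:
            indices.append(yy * W + xx)
            coeffs.append(c)
        else:
            indices.append(0)
            coeffs.append(0.0)
    return np.array(indices), np.array(coeffs)

# Usage: the interpolation becomes a length-4 dot product over gathered samples.
H, W = 8, 8
buffer = np.arange(H * W, dtype=float)   # flattened input feature map
idx, w = address_convert(2.3, 5.7, H, W)
sample = np.dot(buffer[idx], w)          # one bilinear-interpolated value
```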
Keywords/Search Tags: ReRAM, DCN, Low Power, Accelerator