
Research on Deformable Convolution Layer Acceleration Based on Classic CNN Accelerators

Posted on: 2022-02-25    Degree: Master    Type: Thesis
Country: China    Candidate: C Chu    Full Text: PDF
GTID: 2518306560979349    Subject: IC Engineering
Abstract/Summary:
Deformable convolutional networks (DCNs) are widely used in vision tasks and have shown good performance in fields such as object detection, semantic segmentation, object classification, and video action detection. The traditional convolutional layers and the deformable convolutional layers in a DCN are the main sources of computational cost. However, existing neural network accelerators focus mainly on optimizing and accelerating traditional convolutional layers and pay little attention to the deformable convolutional layer. A common approach in current research is to modify the algorithm so that deformable convolution is easier to map onto hardware, but such modifications reduce the accuracy of the network to some degree. As a result, there is currently no accelerator that supports the complete deformable convolution algorithm.

This thesis studies the hardware acceleration of deformable convolution in depth and realizes acceleration of the deformable convolutional layer on top of two baselines: a ReRAM-based convolutional neural network accelerator and a 1D-array convolutional neural network accelerator.

For the ReRAM-based accelerator, we select ReRAM arrays of different sizes and ReRAM cells of different precisions so that the array better supports deformable convolution. We also propose a novel mapping method for bilinear interpolation that avoids the high latency and power consumption caused by sequential writes to the ReRAM array, reduces the number of hardware operations, and completes all bilinear interpolation calculations in situ. We then modify the structure of the input buffer, which avoids a power-hungry four-port design while still supplying the computing array with enough data. Based on this buffer structure, we design a corresponding online index generation unit and improve area utilization through a blocking strategy. Finally, by redesigning the data flow, we increase the parallelism of the accelerator when executing deformable convolution. Experimental results show that, compared with CPU, GPU, ASIC+CPU, and ISAAC+CPU platforms, our design improves performance by 227x, 15.1x, 26.8x, and 20.4x, respectively, and reduces energy consumption by 225x, 17.4x, 32.9x, and 38.6x, respectively.
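To make the accelerated computation concrete, the following is a minimal NumPy sketch of a deformable convolution layer with its bilinear sampling, i.e., the operation both accelerator designs must support without modifying the algorithm. The single-channel simplification, function names, and offset layout are illustrative assumptions, not the thesis's implementation.

```python
# Minimal sketch of deformable convolution with bilinear sampling.
# Single channel, unit stride, zero padding; all names are illustrative.
import numpy as np

def bilinear_sample(fmap, y, x):
    """Sample fmap (H x W) at a fractional location (y, x), zero outside."""
    H, W = fmap.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < H and 0 <= xx < W:
                # Weight of each integer neighbor is the product of the
                # 1D linear interpolation coefficients.
                val += (1.0 - abs(y - yy)) * (1.0 - abs(x - xx)) * fmap[yy, xx]
    return val

def deform_conv2d_single(fmap, weight, offsets):
    """Deformable convolution for one input/output channel.

    fmap:    H x W input feature map
    weight:  k x k kernel
    offsets: H x W x k x k x 2 learned offsets (dy, dx), one per output
             pixel and kernel tap, produced by a separate offset branch.
    """
    H, W = fmap.shape
    k = weight.shape[0]
    out = np.zeros((H, W))
    for p in range(H):
        for q in range(W):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    dy, dx = offsets[p, q, i, j]
                    # Regular grid position plus the learned fractional offset.
                    y = p + i - k // 2 + dy
                    x = q + j - k // 2 + dx
                    acc += weight[i, j] * bilinear_sample(fmap, y, x)
            out[p, q] = acc
    return out
```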
For the more widely used 1D-array convolutional neural network accelerator, this thesis designs a 1D-array deformable convolution accelerator. Following the algorithmic characteristics of deformable convolution, we divide its execution into three stages and splice and fuse the processing stages to avoid repeated data transfers and their expensive communication cost. We then design a new address conversion unit that converts feature indices into input-buffer indices and, at the same time, computes the coefficients required for bilinear interpolation, turning the interpolation into a small convolution operation. Finally, we deploy the bilinear interpolation onto a one-dimensional dot-product array. Experimental results show that our design significantly outperforms the other three hardware platforms when implementing deformable convolutional neural networks.
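As an illustration of the address conversion idea described above, the sketch below decomposes one fractional sampling location into four input-buffer indices and four coefficients, so that bilinear interpolation reduces to a length-4 dot product suitable for a one-dimensional dot-product array. The function name, row-major buffer layout, and zero-padding convention are illustrative assumptions rather than the actual hardware unit.

```python
# Sketch: turn one bilinear interpolation into a small dot product by
# precomputing buffer indices and coefficients (illustrative only).
import numpy as np

def address_convert(y, x, H, W):
    """Return (indices, coeffs) for bilinear sampling at fractional (y, x).

    indices are flat row-major positions into an H*W input buffer; samples
    that fall outside the feature map get coefficient 0 and index 0.
    """
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    fy, fx = y - y0, x - x0
    taps = [(y0,     x0,     (1 - fy) * (1 - fx)),
            (y0,     x0 + 1, (1 - fy) * fx),
            (y0 + 1, x0,     fy * (1 - fx)),
            (y0 + 1, x0 + 1, fy * fx)]
    indices, coeffs = [], []
    for yy, xx, c in taps:
        if 0 <= yy < H and 0 <= xx < W:
            indices.append(yy * W + xx)
            coeffs.append(c)
        else:
            indices.append(0)
            coeffs.append(0.0)
    return np.array(indices), np.array(coeffs)

# Usage: the interpolation becomes a length-4 dot product over gathered samples.
H, W = 8, 8
buffer = np.arange(H * W, dtype=float)   # flattened input feature map
idx, w = address_convert(2.3, 5.7, H, W)
sample = np.dot(buffer[idx], w)          # one bilinear-interpolated value
```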
Keywords/Search Tags: ReRAM, DCN, Low Power, Accelerator