
The Design And FPGA Verification Of A CNN Accelerator With Depthwise Separable Convolutions

Posted on: 2022-01-08 | Degree: Master | Type: Thesis
Country: China | Candidate: J J Su | Full Text: PDF
GTID: 2518306740493784 | Subject: IC Engineering
Abstract/Summary:
In recent years, to extend deep learning applications to platforms with tighter resource and energy constraints, more compact network structures such as depthwise separable convolution have been proposed and applied to various lightweight networks. However, because these compact structures are irregular, most current accelerators cannot effectively exploit the reduction in computation and parameter count that this algorithmic optimization provides. A flexible and efficient accelerator that supports depthwise separable convolution therefore has an inherent advantage in energy efficiency and processing speed. First, this thesis reviews representative neural network compression algorithms and hardware accelerators. Second, it proposes a Ping-Pong mechanism, an output feature map row stationary dataflow, and a 3-D NoC data path. The Ping-Pong mechanism hides data transfer time behind computation time, which further reduces the demand for off-chip bandwidth; the output feature map row stationary dataflow supports multiple convolution types and exploits the different kinds of data reuse as far as possible; the 3-D NoC can be flexibly configured according to the bandwidth requirements of different convolution types. Together, these three methods allow the hardware accelerator to run different convolutional networks efficiently. Finally, a convolutional neural network accelerator that supports depthwise separable convolution is implemented on a Xilinx ZYNQ7100 FPGA development board to verify its execution efficiency for different operations such as standard convolution, channel-separated convolution, and fully connected layers. Experimental results using 8-bit MobileNet as the test model show that, at an operating frequency of 150 MHz, the actual computational throughput of the accelerator is 203.5 GOPS and the frame rate is 187.22 fps. The accelerator designed in this thesis efficiently supports different convolution types, with better flexibility, faster processing speed, and better real-time performance.
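The reduction in computation that the abstract attributes to depthwise separable convolution can be illustrated with a multiply-accumulate (MAC) count. The sketch below is not taken from the thesis. It uses the standard textbook cost model for a k x k convolution, and the layer shape (14 x 14 output, 512 input and output channels, 3 x 3 kernel) and the function names are assumptions chosen only for illustration.

```python
def standard_conv_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a standard k x k convolution (textbook cost model)."""
    return h_out * w_out * c_in * c_out * k * k


def depthwise_separable_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h_out * w_out * c_in * k * k   # one k x k filter per input channel
    pointwise = h_out * w_out * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer shape, assumed for illustration only.
h_out, w_out, c_in, c_out, k = 14, 14, 512, 512, 3

std = standard_conv_macs(h_out, w_out, c_in, c_out, k)
dsc = depthwise_separable_macs(h_out, w_out, c_in, c_out, k)
ratio = dsc / std  # analytically equal to 1/c_out + 1/k**2

print(f"standard conv MACs:       {std}")
print(f"depthwise separable MACs: {dsc}")
print(f"cost ratio:               {ratio:.4f}")
```

For this assumed layer the separable form needs roughly one ninth of the MACs. The depthwise and pointwise stages also have very different data reuse patterns, which is the kind of irregularity the abstract says most accelerators handle poorly.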
Keywords/Search Tags: Convolutional Neural Network Accelerator, Depthwise Separable Convolution, Output Feature Map Row Stationary Dataflow, 3-D NoC