Design And Implementation Of Convolutional Neural Network Accelerator

Posted on: 2022-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Xu
Full Text: PDF
GTID: 2518306602966589
Subject: Master of Engineering
Abstract/Summary:
In recent years, convolutional neural networks (CNNs) have been widely used in fields such as image recognition and intelligent healthcare. CNNs are usually deployed on CPU (Central Processing Unit) or GPU (Graphics Processing Unit) platforms, but the low speed of the CPU and the high power consumption of the GPU cannot support CNN applications on mobile devices. Therefore, balancing speed and power consumption while preserving accuracy is the key to deploying CNNs on mobile terminals. An FPGA (Field Programmable Gate Array) hardware platform has a large speed advantage over a CPU and a far larger power-consumption advantage over a GPU, and FPGAs are well suited to parallel computing, so this design chooses an FPGA as the hardware implementation platform for the CNN. Classical CNNs such as VGGNet and ResNet cannot meet the requirements of low power consumption and high speed because of their huge computational load, so this design adopts GhostNet as the network model. GhostNet replaces standard convolution with the Ghost Module for feature extraction, which greatly reduces the number of parameters and computations.

Against this background, this thesis implements GhostNet on an FPGA platform to explore the feasibility of deploying CNNs on mobile devices. Based on the structural characteristics of GhostNet's conventional, depthwise, and pointwise convolutions and the parallel-computing characteristics of the FPGA, the thesis studies and proposes a dedicated parallelization scheme for each of these three convolution operations. Combining these parallelization schemes with the operation flow of the three convolutions, it designs an integrated conventional convolution module, a pointwise convolution module, and an integrated depthwise convolution module in a modular manner. It also analyzes the operating characteristics of GhostNet's feature-extraction stage and partitions its operation into states; the main control module is implemented as a state machine that keeps the convolution modules maximally utilized. By pipelining the convolution modules within each state, their utilization rate and operating frequency are improved.

The accelerator's functionality is simulated and verified at three levels: the sub-arithmetic units, the convolution operations, and the system timing control. GhostNet's weights are trained on a GPU with the PyTorch framework, and the weights at the highest training accuracy are saved for the subsequent hardware implementation. The GhostNet CNN is implemented on the ZCU104 hardware platform, and its power consumption and speed are compared with implementations on an Arm Cortex-A53 CPU and an NVIDIA GeForce RTX 2080 GPU. The FPGA implementation of GhostNet consumes 2.926 W at a clock frequency of 200 MHz; its speed is 1/14 that of the GPU, but its power consumption is 1/80 of the GPU implementation, and its speed is 8 times that of the CPU platform. Therefore, implementing GhostNet on the FPGA platform achieves the best trade-off between power consumption and speed. In addition, a single-layer CNN is implemented with both conventional convolution and the Ghost Module to compare their speed: at the cost of some feature-extraction capacity, convolution with the Ghost Module is 7.23 times faster than conventional convolution. This design has clear advantages in power consumption and speed and is suitable for CNN applications on mobile terminals.
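The Ghost Module's parameter savings can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the thesis; the expansion ratio `s = 2` and cheap-operation kernel size `d = 3` are assumed defaults from the GhostNet design, and the layer sizes in the example are illustrative.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def ghost_module_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost Module producing the same c_out feature maps.

    A primary k x k convolution produces c_out // s intrinsic maps; cheap
    d x d depthwise convolutions derive the remaining (s - 1) * c_out // s
    "ghost" maps from them, one small filter per ghost map.
    """
    intrinsic = c_out // s
    primary = k * k * c_in * intrinsic   # ordinary convolution on all inputs
    cheap = d * d * intrinsic * (s - 1)  # depthwise filters for the ghost maps
    return primary + cheap

# Example: a 3x3 layer with 128 input and 128 output channels
std = standard_conv_params(128, 128, 3)    # 147456 weights
ghost = ghost_module_params(128, 128, 3)   # 73728 + 576 = 74304 weights
print(f"{std / ghost:.2f}x fewer parameters")
```

With ratio `s`, the savings approach a factor of `s` as the channel count grows, which is why replacing standard convolution with the Ghost Module shrinks both the parameter count and the computation so effectively.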
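The motivation for giving depthwise and pointwise convolution their own parallel modules is visible in their operation counts. The following sketch (not from the thesis; the channel and feature-map sizes are illustrative) compares the multiply-accumulate (MAC) count of one standard convolution layer with the depthwise-plus-pointwise pair that replaces it:

```python
def standard_conv_macs(c_in, c_out, k, h, w):
    # each output pixel in each of c_out maps sums a k*k*c_in window
    return k * k * c_in * c_out * h * w

def depthwise_separable_macs(c_in, c_out, k, h, w):
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1 x 1 convolution mixes channels
    return depthwise + pointwise

# Example: 128 -> 128 channels, 3x3 kernel, 28x28 feature maps
std = standard_conv_macs(128, 128, 3, 28, 28)          # 115,605,504 MACs
sep = depthwise_separable_macs(128, 128, 3, 28, 28)    # 13,748,224 MACs
print(f"{std / sep:.1f}x fewer MACs")
```

The two stages also have very different data-reuse patterns (depthwise touches one channel per filter, pointwise touches all channels per pixel), which is why a single generic convolution engine parallelizes neither well and separate dedicated schemes pay off on an FPGA.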
Keywords/Search Tags: FPGA, convolutional neural network, hardware acceleration, image recognition