Object detection, one of the core tasks of computer vision, is widely used in face detection, remote sensing image analysis, medical image analysis, and other applications. However, the high storage requirements, high power consumption, and high computational complexity of detection networks make deploying convolutional neural networks on embedded devices a significant challenge. With the global community still facing the risk of widespread transmission of pathogenic viruses, efficient detection of mask wearing in public places remains necessary, and advanced deep learning methods can substantially improve detection efficiency. This paper therefore studies embedded acceleration of inference for a mask-wearing detection algorithm. We compare mainstream network lightweighting methods for embedded deployment and choose model pruning, which lightweights the model by removing redundant network parameters; on the hardware side, we choose ZYNQ as the acceleration platform for mask detection, based on the working characteristics of the ZYNQ SoC and the flexible programmability of its FPGA fabric.

First, we propose the EAGP pruning algorithm, a progressive iterative pruning scheme that uses a pruning-rate transformation function in exponential form to prune the network iteratively and more efficiently. Experimental results for ResNet pruning show accuracy improvements of 1.04% and 0.83% over the AGP algorithm and the One-Cycle algorithm, respectively, and the method also performs well for VGG16 pruning.

Second, pruning not only removes a large number of redundant parameters, sparsifying the network, but also increases the proportion of zero-valued weights in the model. We therefore propose an optimized systolic-array design with zero-value detection on the weight branch, which lowers system power consumption by reducing the number of multiplications performed by the convolution multipliers; taking the ResNet model as an example, the number of multiplications is reduced by 1.08%.

Finally, we design and implement inference acceleration for mask detection on the ZYNQ heterogeneous platform using the lightweighted ResNet model: the PS side handles data transfer and command control, while the PL side serves as a co-processor that accelerates the convolutional neural network computation. Experimental results show that the ZYNQ-based convolutional neural network accelerator consumes 1.54 W overall while maintaining high inference accuracy, achieving 15.3x and 1.9x energy-efficiency improvements over CPU and GPU computing platforms, respectively. The energy consumed by accelerated inference of the pruned network on ZYNQ is reduced by 60.82% compared with the unpruned network.
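As a rough illustration of the progressive pruning idea summarized above: the standard AGP schedule ramps sparsity from an initial to a final value with a cubic polynomial, while EAGP replaces that transformation with an exponential-form function. The sketch below is a minimal illustration only; the normalized exponential schedule, the decay parameter k, and the magnitude-based mask update are assumptions for illustration and are not the paper's exact EAGP formulation.

```python
import math

def agp_sparsity(step, total_steps, s_init=0.0, s_final=0.9):
    """Standard AGP schedule: cubic ramp of sparsity from s_init to s_final."""
    frac = min(step / total_steps, 1.0)
    return s_final + (s_init - s_final) * (1.0 - frac) ** 3

def eagp_sparsity(step, total_steps, s_init=0.0, s_final=0.9, k=4.0):
    """Hypothetical exponential-form schedule (illustrative only; the exact
    EAGP transformation function and its parameters are not reproduced here)."""
    frac = min(step / total_steps, 1.0)
    # Normalized so the schedule starts at s_init (frac=0) and ends at s_final (frac=1).
    decay = (math.exp(-k * frac) - math.exp(-k)) / (1.0 - math.exp(-k))
    return s_final + (s_init - s_final) * decay

def prune_to_sparsity(weights, sparsity):
    """Generic iterative magnitude pruning step: zero out the smallest-magnitude
    weights until the prescribed sparsity level is reached."""
    flat = sorted(abs(w) for w in weights)
    idx = min(int(sparsity * len(flat)), len(flat) - 1)
    threshold = flat[idx] if sparsity > 0 else 0.0
    return [0.0 if abs(w) < threshold else w for w in weights]
```

Driving `prune_to_sparsity` once per pruning step with the sparsity returned by a schedule (cubic for AGP, exponential here) reproduces the general progressive-pruning loop; the exponential form front-loads or back-loads the pruning rate depending on k.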
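The zero-value detection on the weight branch can likewise be sketched behaviorally: before each multiply-accumulate in a processing element, the weight operand is checked, and when it is zero the multiplier is bypassed (in hardware this gates the multiplier, saving dynamic power). This is a software-level sketch under that assumption, not the actual processing-element design of the systolic array.

```python
def mac_with_weight_zero_skip(activations, weights):
    """Behavioral sketch of a processing element that skips the multiply
    when the weight operand is zero (the hardware analogue gates the
    multiplier to reduce dynamic power consumption)."""
    acc = 0
    multiplies = 0          # counts multiplications actually performed
    for a, w in zip(activations, weights):
        if w == 0:          # zero-value detection on the weight branch
            continue        # bypass the multiplier; accumulator is unchanged
        acc += a * w
        multiplies += 1
    return acc, multiplies

# Example: zeros introduced by pruning translate directly into skipped multiplies.
acts = [3, 1, 4, 1, 5, 9]
wts  = [2, 0, 0, 7, 0, 1]
result, mults = mac_with_weight_zero_skip(acts, wts)
print(result, mults)  # 22, 3 -> half of the multiplications were skipped
```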