Study And Implementation Of FPGA-based Real Time Multiple Face Detection System

Posted on:2021-05-10

Degree:Master

Type:Thesis

Country:China

Candidate:H J Xu

Full Text:PDF

GTID:2428330611966406

Subject:Microelectronics and Solid State Electronics

Abstract/Summary:


Compared with the traditional AdaBoost algorithm, deep learning-based face detection algorithms achieve lower false detection rates and higher recall rates in complex environments. Face detection has been widely used in intelligent security, passenger flow statistics, mobile device authentication, and other fields. However, computing power at the edge is limited, which makes it difficult to deploy large-scale face detection neural networks in real time. To solve this problem, this thesis designs a small, high-precision face detection algorithm based on MobileNetV2-SSDlite. To improve the computing capability of mobile terminals, the study of a neural network accelerator dedicated to this mobile algorithm focuses on on-chip memory, performance, and power consumption. The main work is as follows:

(1) MobileNetV2-SSDlite is analyzed, and the size and scale of its anchors are adjusted to obtain a small-scale, high-precision, mobile single-shot multi-face detection algorithm. The algorithm is trained and evaluated on the WIDER FACE dataset. The evaluation results show that its mean average precision on the easy, medium, and hard samples reaches 0.897, 0.857, and 0.565, respectively. Its inference time and detection rate are also better than those of the multi-task convolutional neural network (MTCNN) when detecting multiple faces.

(2) To meet the requirements of low power consumption and real-time performance, 8-bit quantization is applied to compress the network. At the same time, the post-processing of the algorithm and the Single Shot MultiBox Detector (SSD) are quantized to 32-bit fixed point to reduce on-chip memory and power consumption.

(3) Three improvements target the linear bottleneck structure. A dedicated convolutional neural network accelerator structure is proposed to reduce off-chip bandwidth. The design can be reconfigured with an adder tree to improve the efficiency of the 1×1 pointwise convolution. The number of accelerator cores is configurable, which balances resources against speed and speeds up inference. Based on this work, a software-hardware interaction system is built in which the accelerator is controlled by instructions to perform convolutional neural network operations.

(4) The real-time multi-face detection accelerator system is simulated and implemented on a Xilinx XC7Z035-2FFG676 FPGA. The results show that the face detection system achieves an average computing performance of 56.01 GOPS at a power consumption of 7.3 W under a 100 MHz clock. Compared with the Kirin 970 processor, it achieves a 2.66 times speedup. Compared with the existing MobileNetV2-SSDlite accelerator, operation speed is increased by 28.7%, resource usage is reduced by 44.46%, and power consumption is reduced by 26.3%. The system processes a 224×224 image at 83.4 FPS, and the mean average precision of the quantized algorithm is 0.825, 0.717, and 0.338 on the easy, medium, and hard samples, respectively.
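As an illustration of how 8-bit quantization and an adder tree fit together in a 1×1 pointwise convolution, the following is a minimal software sketch. It is not the thesis's implementation. The abstract does not specify the quantization scheme, so symmetric per-tensor quantization is assumed here, and all function names (`quantize`, `adder_tree_sum`, `pointwise_conv_pixel`) and sample values are hypothetical.

```python
def quantize(values, num_bits=8):
    """Assumed scheme: symmetric per-tensor quantization, real ~= scale * integer."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def adder_tree_sum(terms):
    """Sum integers pairwise, level by level, mimicking a hardware adder tree."""
    terms = list(terms)
    if not terms:
        return 0
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0)  # pad odd levels so every adder has two inputs
        terms = [terms[i] + terms[i + 1] for i in range(0, len(terms), 2)]
    return terms[0]


def pointwise_conv_pixel(x_q, w_q, x_scale, w_scale):
    """1x1 convolution at one pixel: one integer dot product per output channel.

    x_q: quantized input channels for the pixel.
    w_q: quantized weights, indexed [out_channel][in_channel].
    """
    out = []
    for w_row in w_q:
        acc = adder_tree_sum(x * w for x, w in zip(x_q, w_row))  # integer accumulate
        out.append(acc * x_scale * w_scale)  # rescale back to real values
    return out


# Hypothetical sample data: 4 input channels, 2 output channels.
x = [0.5, -1.0, 0.25, 0.75]
w = [[0.2, -0.4, 0.1, 0.3], [-0.5, 0.5, 0.0, 0.25]]

x_q, x_scale = quantize(x)
flat_q, w_scale = quantize([v for row in w for v in row])
w_q = [flat_q[i * 4:(i + 1) * 4] for i in range(2)]

y = pointwise_conv_pixel(x_q, w_q, x_scale, w_scale)
y_ref = [sum(a * b for a, b in zip(x, row)) for row in w]  # float reference
print(y, y_ref)
```

Because a 1×1 convolution reduces to a dot product across input channels at each pixel, the multiply results can be summed in a tree of depth about log2 of the channel count, which is why an adder tree is a natural fit for this operation. The quantized outputs approximate the floating-point reference to within a small rounding error.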

Keywords/Search Tags:

Face Detection, MobileNetV2 Network, Hardware Accelerator, FPGA


Related items

1	Design And Implementation Of Lightweight Convolutional Neural Network Accelerator On SoPC
2	Implementation And Application Of Hardware Accelerator Based On Image Recognition Technology
3	Research On Scalable Accelerator Design For Face Detection And Recognition Application
4	The Algorithm Design And FPGA Verification Of Face Detection And Recognition Based On 8Bit Quantization Neural Network
5	Design Of Hardware Accelerator Based On FPGA For Convolutional Neural Networks
6	Design And Research Of Convolutional Capsule Network Accelerator Based On FPGA
7	Research Of Scalability On FPGA-based Neural Network Accelerator
8	Research On Implement Of DMC Controller With Hardware Accelerator
9	Design And Optimization Of Configurable Hardware Accelerator For LSTM Neural Network
10	Research And Implementation Of High-speed Object Detection Network Based On FPGA Accelerator