Research And Lightweight Implementation Of Face Detection In Unconstrained Scenes

Posted on: 2021-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z Yang
Full Text: PDF
GTID: 2518306476950799
Subject: Electronics and Communications Engineering
Abstract/Summary:
Deep learning has been a focus of worldwide attention in recent years. Face detection research based on Convolutional Neural Networks has gradually superseded the traditional approach based on hand-crafted templates and has matured considerably. Face detection methods based on deep neural networks learn facial features autonomously, and they can be roughly divided into two-stage and one-stage methods according to whether there is a region proposal process. The two-stage method performs classification and regression only after generating candidate regions. It is a coarse-to-fine process with high detection accuracy and is better suited to multi-class scenarios such as plants and cars. Face detection, by contrast, is essentially a binary classification problem with real-time requirements in practical applications, and the two-stage method is relatively complicated and inefficient to run. The one-stage method completes prediction in a single step and is straightforward to deploy. It detects quickly and still has much room for improvement in accuracy, which gives it high research value. Therefore, building on the one-stage method, this thesis designs dedicated modules and strategies to address open problems of face detection in unconstrained scenes, such as insufficient extraction and utilization of facial features and the difficulty of balancing real-time performance with accuracy. The main innovations of this thesis are as follows:

1. To cope with insufficient extraction and utilization of facial features, the Contextual Reasoning Face Detector is proposed. This method uses Low-level Feature Pyramid Networks to fuse the features of different layers with weights in order to extract more expressive descriptive information, and uses the Contextual Reasoning Prediction module to expand the subnet, deepening and widening the network model during prediction, which compensates for facial features that are not fully extracted. The Adaptive Anchor Sampling data augmentation method and the Multi-scale Model Training method are introduced to enhance the adaptability of the model to scale, thereby improving the utilization of facial features. Experiments show that, on the two authoritative benchmarks WIDER FACE and FDDB, this method significantly outperforms prior methods of comparable model size.

2. Because the above method still relies on a single facial feature extraction mode for unconstrained faces, the Detection with Feature Strengthen and Progressive Cascade is proposed. Besides focusing on contextual cues, this method mines the features of the current layers via the Feature Strengthen Module to implement a dual-branch architecture. Correspondingly, the Progressive Loss function is designed to match the progressive learning ability of each branch and each hierarchical feature map, which enriches the facial feature extraction mode. To deal with the abnormal sample distribution caused by dense sampling of small anchors, this method applies the Max-Both-Out strategy and builds the Iterative Cascade Structure, in which the Intersection over Union threshold increases gradually from stage to stage so that each stage receives a more appropriate sample distribution. Experimental results demonstrate that the accuracy of this method on WIDER FACE and FDDB surpasses that of the above method, and that its performance is considerably improved.

3. Taking industrial real-time algorithm models as the starting point, the Task-oriented and Lightweight Face Detector is proposed. Because real-time speed and accuracy are difficult to achieve simultaneously, the design trades one off against the other. On the one hand, a lightweight backbone network is used to retain as many of the original features as possible in a limited scale space. On the other hand, Associated Anchors are introduced to generate head and body information around the face in a semi-supervised manner to assist the detection of the target face. This method can alleviate mutual interference between feature maps of different layers. The Feature Integration Module is used to prevent high-level semantics from destroying low-level details and to streamline computation. The Task-oriented Strategy is applied during inference to handle classification and regression separately, which keeps low-level feature maps with insufficient discrimination out of location regression and thus makes the model highly efficient. Experiments show that this method reaches an advanced level on WIDER FACE and FDDB, especially in the detection of medium- and low-difficulty faces.

The inference speed of the algorithms described in this thesis reaches real-time, and even industrial-grade real-time, standards. At the end of the thesis, the deficiencies of these algorithms are pointed out and directions for further work are listed.
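The abstract mentions an Intersection over Union (IoU) threshold that increases gradually across cascade stages. The following is only a minimal sketch of that general idea, not the thesis's implementation: the box format `(x1, y1, x2, y2)`, the function names `iou` and `positives_per_stage`, and the thresholds `(0.3, 0.5, 0.7)` are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def positives_per_stage(
    anchors: Sequence[Box],
    faces: Sequence[Box],
    thresholds: Sequence[float],
) -> List[List[int]]:
    """For each stage threshold, list the anchor indices whose best IoU
    with any ground-truth face meets that threshold (hypothetical helper)."""
    best = [max((iou(a, f) for f in faces), default=0.0) for a in anchors]
    return [[i for i, v in enumerate(best) if v >= t] for t in thresholds]


# Toy data: one face and four anchors with decreasing overlap.
faces = [(0.0, 0.0, 10.0, 10.0)]
anchors = [
    (0.0, 0.0, 10.0, 10.0),    # IoU = 1
    (2.0, 0.0, 12.0, 10.0),    # IoU = 80 / 120
    (5.0, 0.0, 15.0, 10.0),    # IoU = 50 / 150
    (20.0, 20.0, 30.0, 30.0),  # no overlap
]
stage_thresholds = (0.3, 0.5, 0.7)  # illustrative values, not from the thesis
stages = positives_per_stage(anchors, faces, stage_thresholds)
print(stages)
```

With these toy boxes, each later stage keeps a smaller, better-aligned set of positive anchors, which is the effect a rising IoU threshold is meant to have on the sample distribution.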
Keywords/Search Tags: Face Detection, One-Stage Method, Contextual Reasoning, Progressive, Lightweight