| With the growing popularity of artificial intelligence, intelligent robots, unmanned systems and intelligent video surveillance systems are gradually maturing and becoming part of daily life. As a key technology for identity recognition, face detection is the theoretical basis and technical support for these areas. The goal of face detection is to avoid both missed faces and redundant detections. However, face detection in the wild remains an open research challenge, especially when faces appear at vastly different scales and with widely varying characteristics. Recall is unsatisfactory because of low resolution, blur and occlusion. Traditional face detection techniques cannot solve these problems effectively, while face detection models based on deep learning must balance efficiency against accuracy. Therefore, this thesis presents YOMO, a new approach to real-time face detection, which frames object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A series of strategies, including model optimization, data augmentation and feature fusion, are applied to YOMO so that it can detect faces at vastly different scales and with varying characteristics. The main contributions of this thesis are as follows. Firstly, to trade off latency against accuracy efficiently, YOMO is built from depthwise separable convolutions, which reduce the number of parameters. As a result, YOMO, whose model size is only 20 MB, runs at 50.6 FPS on VGA-resolution images. Secondly, feature pyramids are used to mine the small-scale face information contained in high-resolution feature maps. To enhance detection of small-scale faces, this thesis uses feature fusion to combine low-level features, which carry fine-grained information, with high-level features, which carry semantic information. In addition, the number of detection modules is increased. To improve the multi-scale detection
performance of YOMO, each detection module is responsible for faces of a different scale. Thirdly, prediction accuracy is improved through the choice of the kernel activation function and the gradient-based optimization method. A background filling algorithm, a semi-soft random clipping algorithm and elliptic regression are proposed to improve the recall rate. The background filling algorithm keeps the feature distribution of images consistent between training and testing. The semi-soft random clipping algorithm balances the number of training samples seen by each detection branch during training. The elliptic regressor transforms the rectangular predicted box into an elliptical boundary. Under the ContROC evaluation criterion, these three methods increase the recall rate of YOMO by 1.5%, 2.5% and 8.7%, respectively. Finally, under the DiscROC and ContROC evaluation criteria of the FDDB dataset, face detection models based on region proposal methods, cascaded network methods and regression methods are analyzed experimentally. The experimental results show that the proposed YOMO has clear advantages in model size and detection efficiency while maintaining high detection accuracy. |
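
The parameter saving attributed to depthwise separable convolution can be illustrated with a short calculation. The sketch below is not taken from the thesis: the function names and the example layer shape (3x3 kernel, 128 input channels, 256 output channels) are assumptions chosen for illustration, and bias terms are ignored. It uses the standard weight counts for a regular convolution and for a depthwise convolution followed by a 1x1 pointwise convolution.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a regular k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def reduction_ratio(k: int, c_in: int, c_out: int) -> float:
    """Separable / standard parameter ratio; equals 1/c_out + 1/k**2."""
    return depthwise_separable_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)


if __name__ == "__main__":
    # Hypothetical layer shape, not a layer from YOMO.
    k, c_in, c_out = 3, 128, 256
    print("standard: ", standard_conv_params(k, c_in, c_out))
    print("separable:", depthwise_separable_params(k, c_in, c_out))
    print("ratio:    ", round(reduction_ratio(k, c_in, c_out), 4))
```

For 3x3 kernels the ratio approaches 1/9 as the number of output channels grows, which is why replacing regular convolutions with separable ones can shrink a model considerably. The actual saving in YOMO depends on its layer configuration, which the abstract does not give.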