Real-time Face Monitoring Based On Convolutional Neural Network Detection And Application

Posted on: 2022-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Li
Full Text: PDF
GTID: 2518306515972729
Subject: Computer Science and Technology
Abstract/Summary:
As a unique biometric feature that can be captured without contact, the human face plays an important role in everyday identity verification. As society advances and the need for rapid identity verification grows, people expect computers to detect human faces accurately and quickly, thereby improving existing intelligent applications. Face detection is a prerequisite for computer face recognition and has broad application prospects. It is widely used in face recognition and verification, face tracking in surveillance settings, facial expression recognition, facial attribute recognition, facial lighting adjustment and deformation, image and video retrieval, and other fields. In recent years, face detection has become a main research direction in computer vision and has attracted a large number of researchers. Face detection in surveillance video using convolutional neural networks is the main research content of this thesis. The specific work is as follows:

1. Deep convolutional neural networks detect slowly and lose much of the facial feature information of small targets, while shallow convolutional neural networks extract facial features insufficiently. The YOLOv3 network structure is therefore selected as the backbone. YOLOv3 is composed of the Darknet-53 network and a feature pyramid network (FPN). It has a moderate number of layers and uses the feature pyramid to combine the features learned at the end of the network with the features from the beginning of the network. The YOLOv3 network is combined with the CBAM attention mechanism and the receptive field module, and different numbers of receptive field modules and attention mechanisms are added to the backbone. After comprehensive consideration, the best network structure achieves the following results on the WIDER FACE data set: easy 69%, medium 66%, hard 40%.

2. Because the YOLOv3 network performs poorly in face detection, especially on the hard subset of WIDER FACE, the S3FD network structure is selected as the main body. S3FD is more effective in terms of face detection accuracy and speed. Different numbers of receptive field modules and attention mechanisms are added to the layers of the S3FD network, and the feature pyramid idea from YOLOv3 is adopted: the features learned after multiple convolution layers are upsampled and combined with the feature maps from the early convolution stages, which lose less information about small targets. In anchor selection, max-out is applied on all feature pyramid paths, with the feature-map variables passing through max-out on at most one level, to reduce the number of variables and the computation time. After comprehensive consideration, the best network structure achieves the following results on the WIDER FACE data set: easy 95.0%, medium 93.7%, hard 86.4%. Compared with the original S3FD, these are improvements of 1.3, 1.2, and 0.5 percentage points respectively. Detection accuracy is clearly improved, but the modified network structure is slower than YOLOv3.

3. The application of face detection here is mainly face counting, which also lays the groundwork for future face recognition. To address the slow detection speed of the modified network structure, the targets in the video are divided into three categories: persistent targets, disappearing targets, and suddenly appearing targets. Based on how these three types of targets appear, the difference between two consecutive video frames is used as the threshold for deciding whether any object has suddenly appeared or disappeared between them. If the difference between the latter two frames is greater than the difference between the former two frames, target detection is carried out. As a safeguard, detection is also run on the video at a fixed interval of every 5 frames, so as to improve the speed and accuracy of the network's video detection.
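The feature-combination step described in item 2 (upsampling deep features and combining them with earlier feature maps) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names `upsample_nearest_2x` and `merge_top_down` are hypothetical, nearest-neighbour upsampling and channel-wise concatenation (the merge style used in YOLOv3) are assumptions, and the convolutions that normally surround this step are omitted.

```python
import numpy as np

def upsample_nearest_2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def merge_top_down(deep, shallow):
    """Upsample a deep feature map and concatenate it with a shallow one.

    Assumption: the shallow map has exactly twice the spatial size of the
    deep map, and the two are merged along the channel axis.
    """
    up = upsample_nearest_2x(deep)
    if up.shape[1:] != shallow.shape[1:]:
        raise ValueError("spatial sizes must match after upsampling")
    return np.concatenate([up, shallow], axis=0)

# Hypothetical shapes: 8 deep channels at 13x13, 4 shallow channels at 26x26.
deep = np.random.rand(8, 13, 13).astype(np.float32)
shallow = np.random.rand(4, 26, 26).astype(np.float32)
merged = merge_top_down(deep, shallow)
```

The merged map keeps the higher spatial resolution of the shallow layer, which is where small faces are still visible, while carrying the deeper layer's features alongside it.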
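The frame-selection rule in item 3 can be sketched as follows. This is one reading of the abstract under stated assumptions, not the thesis code: the difference metric (mean absolute pixel difference) is assumed, the names `frame_difference` and `select_frames_for_detection` are hypothetical, and the sketch only decides which frames would be sent to the detector.

```python
import numpy as np

def frame_difference(a, b):
    """Mean absolute pixel difference between two frames (assumed metric)."""
    return float(np.mean(np.abs(a.astype(np.float32) - b.astype(np.float32))))

def select_frames_for_detection(frames, interval=5):
    """Return the indices of frames on which the detector would be run.

    A frame is selected when the difference to its predecessor exceeds the
    previous pair's difference, or when it falls on the fixed interval.
    """
    selected = []
    prev_frame = None
    prev_diff = None
    for i, frame in enumerate(frames):
        run_detector = (i % interval == 0)  # fixed-interval safeguard
        if prev_frame is not None:
            curr_diff = frame_difference(prev_frame, frame)
            # Latter pair differs more than the former pair: a target may
            # have suddenly appeared or disappeared.
            if prev_diff is not None and curr_diff > prev_diff:
                run_detector = True
            prev_diff = curr_diff
        if run_detector:
            selected.append(i)
        prev_frame = frame
    return selected

# Synthetic clip: 7 empty frames, then an object appears and persists.
frames = [np.zeros((4, 4), dtype=np.uint8)] * 7 \
       + [np.full((4, 4), 100, dtype=np.uint8)] * 5
indices = select_frames_for_detection(frames)
```

On frames that are not selected, a counting application would presumably reuse the most recent detection result; the abstract does not say how this is handled.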
Keywords/Search Tags: convolutional neural network, attention mechanism, face detection, image processing