In the post-epidemic era, personal protection remains a key concern. Other highly contagious diseases, both domestic and international, still call for vigilance and for effective prevention and control measures, so that society is prepared for the next outbreak. In modern society, where diseases spread faster and more widely than ever before, wearing a mask is considered an important preventive measure. For the face mask detection task, 1214, 3517, and 7516 images were selected from Baidu Images, the Wuhan University CFMD dataset, and the AIZOO dataset respectively, forming a dataset of 12247 images in total. Based on this dataset, this thesis makes a series of improvements to the YOLOv4 algorithm so that its performance meets the real-time requirements of the face mask detection task as far as possible. The main work is as follows: (1) In the feature extraction part, the feature extraction network of the original algorithm is replaced with MobileNetV2, a lightweight network, to reduce the number of parameters and the computational cost and to improve detection speed. Experiments show that the modified model has fewer parameters and a significantly higher detection speed (FPS). (2) In the feature fusion part, a weighted bi-directional feature pyramid network (BiFPN) replaces the PANet of the original algorithm, so that low-level and high-level semantic information is fused more thoroughly and the ability to detect small targets improves. Experiments show that the algorithm with the lightweight feature extraction network and the BiFPN feature fusion network loses slightly in overall accuracy, but its detection speed improves by 15.1 FPS and its number of parameters is significantly reduced. (3) Drawing on the idea of the attention mechanism, which makes the model pay more attention to important feature information, the squeeze-and-excitation network (SENet) and the CBAM attention mechanism are introduced and compared in an ablation experiment. The experiments show that CBAM contributes more to detection accuracy, so CBAM is chosen to further optimize the above algorithm. (4) The three network structure improvements above raise the detection accuracy and detection speed of the new algorithm, but the detection speed still does not meet the real-time detection requirement. For this reason, the post-processing and pre-processing stages of the algorithm are further optimized. In the post-processing stage of the original YOLOv4, the aspect-ratio term of the CIOU loss function is poorly defined, among other problems; two alternative loss functions, EIOU+Focal and SIOU, are therefore introduced and compared in an ablation experiment. In the pre-processing stage, four sets of comparison tests with different input sizes are designed to reflect the speed and accuracy requirements of the mask detection network, and the input image size best suited to this task is selected by weighing accuracy against speed. Based on the experimental results, the EIOU function is finally selected to replace the original CIOU function, the Focal Loss idea is introduced to address the imbalance in sample quality, and 480×480 is selected as the input image size. The final improved algorithm in this thesis raises the detection speed to 45.8 FPS at the same accuracy as the original YOLOv4, which meets the requirement of real-time detection.
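To illustrate the loss function selected in (4), the following is a minimal sketch of an EIOU loss with an optional focal weight for a single pair of boxes. It assumes the commonly published Focal-EIoU formulation (the EIOU loss multiplied by IoU raised to a power γ); the function name `eiou_loss`, the corner-coordinate box format, and the default values are illustrative assumptions and are not taken from the thesis implementation.

```python
def eiou_loss(box_pred, box_gt, gamma=0.0, eps=1e-9):
    """EIOU loss for one box pair; boxes are (x1, y1, x2, y2) corners.

    gamma=0.0 gives the plain EIOU loss. gamma>0 applies the focal
    weight IoU**gamma (assumed formulation, not the thesis's own code).
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2

    # EIOU penalizes center distance, width gap and height gap separately,
    # instead of the single aspect-ratio term used by CIOU.
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (pw - gw) ** 2 / (cw * cw + eps)
    height_term = (ph - gh) ** 2 / (ch * ch + eps)

    loss = 1.0 - iou + center_term + width_term + height_term
    return (iou ** gamma) * loss
```

Calling `eiou_loss((0, 0, 2, 2), (1, 1, 3, 3))` evaluates the plain EIOU loss, while passing `gamma=0.5` down-weights low-overlap (low-quality) box pairs, which is the sample-quality rebalancing that the focal variant is meant to provide.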