Currently, most facial recognition, human-machine interaction, and liveness detection technologies process visible-light images of faces. Such images are easily affected by lighting conditions, and much of the feature information can be lost when external lighting is poor. Infrared facial images avoid this disadvantage because they capture thermal radiation, but a single infrared facial image lacks comprehensive detail. Fusing the two types of images allows each to compensate for the other's shortcomings. This project therefore focuses on algorithms for fusing infrared and visible-light facial images.

The facial images used in the fusion experiments come from the USTC-NVIE database of the University of Science and Technology of China. Unlike general image fusion, infrared and visible-light facial images are captured by different devices and may differ in size, angle, and position, and the original database contains a large amount of useless background information. To eliminate these interfering factors and ensure the quality of the fused facial image, the source images are preprocessed into a "standard facial image" for fusion. In the preprocessing stage, the eyes must be detected and their coordinates determined to support the subsequent normalization of the facial image. For this task, a specialized Eye-YOLO model for detecting small targets such as human eyes is proposed, based on the YOLOv5 one-stage object detection model. Experiments validate that Eye-YOLO achieves high accuracy in eye detection.

The study then surveys the traditional and deep-learning algorithms currently used for infrared and visible image fusion. Building on the generative adversarial network framework, the existing algorithms are improved and a residual attentional generative adversarial network (RAFGAN) is proposed for infrared and visible facial image fusion. RAFGAN replaces traditional convolution with multi-scale convolution (PSConv) to operate at a finer granularity, introduces a residual attention module (Res ECA-Net) into the generator to strengthen the network's ability to extract and preserve facial feature information, and designs the generator's loss function as a combination of structural loss, perceptual loss, infrared gradient loss, and visible-light intensity loss, optimizing the original generative adversarial network from multiple angles. The effectiveness of these improvements is verified through ablation experiments. Finally, to evaluate the fusion quality on facial images, RAFGAN is compared with several commonly used image fusion algorithms using a combination of subjective and objective evaluation, and the quality of the fused facial images is assessed through multiple objective metrics. The analysis shows that RAFGAN performs well on all objective quantitative metrics, indicating that it is an effective algorithm for infrared and visible facial image fusion.
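The eye-based normalization described above can be sketched as computing a similarity transform that maps the two detected eye centers (e.g. from Eye-YOLO bounding boxes) onto fixed canonical positions, so that both modalities share one "standard facial image" geometry. This is a minimal NumPy sketch; the canonical eye coordinates and crop size are hypothetical, not values from the thesis.

```python
import numpy as np

def eye_alignment_affine(left_eye, right_eye,
                         target_left=(30.0, 40.0), target_right=(70.0, 40.0)):
    """Build a 2x3 affine matrix mapping detected eye centers to canonical
    positions (hypothetical coordinates for a 100x100 crop)."""
    lx, ly = left_eye
    rx, ry = right_eye
    # Similarity transform: rotation + scale derived from the eye-to-eye vector
    src_vec = np.array([rx - lx, ry - ly], dtype=float)
    dst_vec = np.array([target_right[0] - target_left[0],
                        target_right[1] - target_left[1]], dtype=float)
    angle = np.arctan2(dst_vec[1], dst_vec[0]) - np.arctan2(src_vec[1], src_vec[0])
    scale = np.linalg.norm(dst_vec) / np.linalg.norm(src_vec)
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    R = np.array([[c, -s], [s, c]])
    # Translation that pins the left eye to its canonical position
    t = np.array(target_left) - R @ np.array([lx, ly], dtype=float)
    return np.hstack([R, t[:, None]])  # 2x3 matrix, usable with cv2.warpAffine
```

Applying the resulting matrix to both the infrared and visible source images (for instance via `cv2.warpAffine`) would place the eyes at identical pixel positions in the two modalities before fusion.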
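The composite generator loss described above can be illustrated with a minimal NumPy sketch. Only the infrared gradient and visible-light intensity terms are implemented here; the structural (SSIM) and perceptual (feature-space) terms the thesis also uses are noted in comments, and the weights `w_grad` and `w_int` are hypothetical placeholders, not the values used in RAFGAN.

```python
import numpy as np

def gradient(img):
    # Simple finite-difference gradient magnitude as an edge-strength proxy
    gx = np.abs(np.diff(img, axis=1, append=img[:, -1:]))
    gy = np.abs(np.diff(img, axis=0, append=img[-1:, :]))
    return gx + gy

def infrared_gradient_loss(fused, ir):
    # Penalize deviation of the fused image's edges from the infrared edges,
    # pushing the fused result to keep thermal structure
    return float(np.mean((gradient(fused) - gradient(ir)) ** 2))

def visible_intensity_loss(fused, vis):
    # Penalize pixel-intensity deviation from the visible-light image,
    # pushing the fused result to keep visible-light appearance
    return float(np.mean((fused - vis) ** 2))

def generator_content_loss(fused, ir, vis, w_grad=10.0, w_int=1.0):
    # Hypothetical weighting; the full RAFGAN loss additionally includes
    # structural (SSIM) and perceptual loss terms, omitted for brevity
    return (w_grad * infrared_gradient_loss(fused, ir)
            + w_int * visible_intensity_loss(fused, vis))
```

The point of combining the terms is that neither source can dominate: the gradient term anchors the fused image to infrared edge structure while the intensity term anchors it to visible-light appearance, and the relative weights trade the two off.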