| As face recognition technology has become widespread, its inherent security issues have attracted attention. Face recognition is widely used in identity authentication systems because it is convenient, accurate, and contactless. However, face information is easy to obtain: attackers can collect users' face images through social media, covert photography, and similar means, and then use print attacks, replay attacks, and mask attacks that pose a serious threat to user privacy and property security. Confirming that the person presenting a face is the genuine user is therefore essential to the practical application of face recognition, and an important prerequisite for its widespread deployment. Face liveness detection, also known as face anti-spoofing or face presentation attack detection, determines whether the face presented to the system belongs to a live, genuine user or is a spoof. Silent face liveness detection algorithms, however, generalize poorly in cross-domain scenarios. This thesis studies face liveness detection based on Vision Transformer and auxiliary information supervision. The main research and innovations of the thesis are as follows: (1) A face liveness detection algorithm based on Vision Transformer and depth information supervision is proposed. Because it is difficult to supervise a Vision Transformer with auxiliary information, a depth generation module is designed to imitate the depth map estimated by a 3D face reconstruction algorithm. This directs the algorithm's attention to the difference in depth information between real and spoofed faces, thereby guiding its feature learning. (2) A face liveness detection algorithm that combines depth information and background information is proposed. To address the problem that single-modal silent face liveness detection yields few spoofing cues, background information is fused in as a supplementary source of cues. To reduce the influence of noise in the background information, a structure with a main branch and an auxiliary branch is used: the original captured frame is the input to the auxiliary branch, and the face image is the input to the main branch. Multi-scale information and depth information are used to further improve the algorithm's performance. (3) A face liveness detection algorithm that combines temporal information and depth information is proposed, exploring the use of video Transformer methods for face liveness detection. To avoid neglecting facial spatial features while exploring temporal information, a multi-scale depth map supervision scheme is proposed, and a multi-scale supervision module is designed to supervise the input image of each frame. In addition, a scoring method that jointly uses temporal information and multi-frame depth information is proposed for this model, which further improves its classification accuracy. |
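The abstract does not specify how the scoring method in contribution (3) combines its two sources of evidence. The following is a minimal sketch of one plausible form, not the method used in the thesis. It assumes that the temporal branch outputs a clip-level live probability, that each frame has a predicted depth map with values in [0, 1] (close to zero for spoofed faces), and that the two are blended with a weight `alpha`. The names `depth_score`, `joint_liveness_score`, and `alpha` are hypothetical.

```python
from statistics import fmean


def depth_score(depth_map):
    """Mean value of one predicted depth map (rows of floats in [0, 1]).

    Assumption: a spoofed face yields a near-zero depth map, so a higher
    mean suggests a live face.
    """
    return fmean([value for row in depth_map for value in row])


def joint_liveness_score(temporal_prob, depth_maps, alpha=0.5):
    """Blend a clip-level live probability with per-frame depth scores.

    temporal_prob: hypothetical live probability from the temporal branch.
    depth_maps:    one predicted depth map per input frame.
    alpha:         hypothetical weight on the temporal probability.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not depth_maps:
        raise ValueError("at least one depth map is required")
    # Average the depth evidence over all frames, then mix the two scores.
    mean_depth = fmean([depth_score(m) for m in depth_maps])
    return alpha * temporal_prob + (1.0 - alpha) * mean_depth
```

A clip would then be classified as live when the joint score exceeds a threshold chosen on a validation set. Averaging over frames is only one option; a median or a learned fusion layer would fit the same interface.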