Research On Identity Authentication Algorithm Based On Deep Learning

Posted on:2022-06-28

Degree:Master

Type:Thesis

Country:China

Candidate:J P Qiu

Full Text:PDF

GTID:2518306740496464

Subject:Signal and Information Processing

Abstract/Summary:

Biometric identification technology uses computers to identify and authenticate people by their physiological or behavioral characteristics. With the steady growth of computing power and the spread of the Internet, biometric identification has developed rapidly in recent years and is now widely used in everyday life. Face recognition and voiceprint recognition are two of the most widely used single-modal biometric technologies. As deep learning has advanced and application scenarios have broadened, research on multi-modal biometric recognition has also been increasing. This thesis therefore studies identity authentication algorithms based on deep learning, covering face recognition, voiceprint recognition, and multi-modal speaker recognition. The main work is as follows.

First, to address the long-tail distribution of face recognition databases, an improved loss function, the minimum margin loss, is proposed to optimize the distance between classes and strengthen the discriminative power of the deep features. Comparative experiments were carried out on several face datasets by adding different combinations of loss functions to the deep learning network. The results show that the improved loss function raises face recognition performance, reduces the negative impact of the long-tail data distribution, and lays the foundation for the subsequent multi-modal recognition experiments.

Second, to support the multi-modal fusion experiments, voiceprint recognition algorithms are studied. The voiceprint features are mainly extracted as Mel-frequency cepstral coefficients (MFCCs). The deep networks VGGNet and ResNet were each evaluated on the foreign-language speech dataset VoxCeleb. Comparison with traditional voiceprint learning methods verifies the ability of deep learning to extract audio features.

Finally, because single-modal speaker recognition in video is sensitive to video quality and speaker activity, a multi-modal recognition method that combines the face and voice modalities is studied. To fuse the face and speech features, a decomposed (factorized) bilinear fusion method is used to improve the fusion result. An end-to-end network model based on the attention mechanism is proposed to preserve the semantic consistency of face and audio, so that the model attends more closely to the key points of the face. The decomposed bilinear fusion method and the attention-based end-to-end model are compared and tested on the BBT and Friends datasets. The experimental results show that the proposed method integrates face and audio information more effectively and helps extract the key multi-modal features, which improves recognition accuracy.
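The bilinear fusion of face and voice features described above can be illustrated with a small sketch. The abstract does not give the exact formulation, so this assumes one common low-rank variant: both feature vectors are projected into a shared space, multiplied element-wise, sum-pooled over windows of size k, and then normalised. The function names, the toy dimensions, and the random projection matrices U and V are all hypothetical and stand in for learned parameters; this is not the thesis's implementation.

```python
import math
import random


def project(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def factorized_bilinear_fusion(face, voice, U, V, k):
    """Fuse two feature vectors with a low-rank (factorized) bilinear form.

    Assumed shapes: U is (k * out_dim) x len(face), V is (k * out_dim) x len(voice).
    Each output element sums k consecutive entries of (U face) * (V voice).
    """
    joint = [a * b for a, b in zip(project(U, face), project(V, voice))]
    if len(joint) % k != 0:
        raise ValueError("projection size must be a multiple of k")
    # Sum-pool non-overlapping windows of size k.
    pooled = [sum(joint[i:i + k]) for i in range(0, len(joint), k)]
    # Signed square root, then L2 normalisation (a common post-processing choice).
    rooted = [math.copysign(math.sqrt(abs(p)), p) for p in pooled]
    norm = math.sqrt(sum(r * r for r in rooted)) or 1.0
    return [r / norm for r in rooted]


def random_matrix(rows, cols, rng):
    """Stand-in for learned projection weights (illustrative only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
face_dim, voice_dim, out_dim, k = 8, 6, 4, 3  # toy sizes chosen for illustration
U = random_matrix(k * out_dim, face_dim, rng)
V = random_matrix(k * out_dim, voice_dim, rng)
face = [rng.random() for _ in range(face_dim)]    # placeholder face embedding
voice = [rng.random() for _ in range(voice_dim)]  # placeholder voice embedding

fused = factorized_bilinear_fusion(face, voice, U, V, k)
print(len(fused))
```

Compared with a full bilinear (outer-product) fusion, which would need a weight tensor of size face_dim x voice_dim x out_dim, this factorized form needs only the two projection matrices, which is the usual motivation for decomposing the bilinear map.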

Keywords/Search Tags:

identity authentication, deep learning, attention mechanism, feature fusion, video analysis

Related items

1	Research On Feature Fusion Strategies Of Attention Mechanism In Image Description
2	Research On Sentiment Analysis Method Of Multi-feature Fusion Based On Attention
3	Method And Application Of Text Sentiment Analysis Based On Fusion Of Surface And Deep Features
4	Research On Feature Fusion Strategy Of Recommender Based On Deep Learning
5	Image Semantic Segmentation Based On Multi-level Feature Fusion And Attention Mechanism
6	Video Behavior Analysis Based On Deep Learning
7	Video Question Answering Based On Deep Learning
8	Research On Video Caption Generation Depth Model Based On Video Temporal Attention Level Fusion Mechanism
9	Human Action Recognition Based On Attention Mechanism And Multi-Modality Feature Fusion
10	Research On Semantic Analysis Method Of Market Stall Monitoring Video Based On Multi-scale Feature Fusion