Speaker Identification And Authentication Based On Lip In Complex Scenarios

Posted on: 2021-04-07  Degree: Master  Type: Thesis
Country: China  Candidate: J H Sun  Full Text: PDF
GTID: 2518306503973499  Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, user identification and authentication based on human biometrics have received increasing attention. Alongside faces, fingerprints, and irises, lips are a highly distinguishing biometric feature. Compared with other biometric features, lips carry both static and dynamic features. Recent research shows that the static and dynamic features of a lip utterance contain abundant identity-related information and can serve as a new type of biometric for identifying a user, but extracting the most discriminative sub-segments from a lip sequence remains difficult. Moreover, because many applications of lip features operate in natural environments, the problems caused by varying lighting in such complex scenarios cannot be underestimated. To address these problems, this thesis proposes a lip-based speaker identification and authentication mechanism for complex scenarios. Under this mechanism, the network computes a discriminative weight for each lip segment. The entire lip utterance is first divided into a series of overlapping segments. Each segment is then fed to a 3D convolutional neural network that extracts lip features for identification and authentication, while an adaptive scheme automatically assesses the discriminative power of each segment and assigns it a corresponding weight within the utterance. The final result for the utterance is determined by weighted voting over the results of all segments. In addition, a non-linear optimization is applied to the model's predicted probabilities, which increases the influence of highly discriminative segments and, to a certain extent, suppresses that of weakly discriminative ones. To handle the varied lighting conditions of natural environments, illumination normalization and data augmentation are adopted to improve the robustness of the model. Experimental results show that different segments of the same utterance have different discriminative power for user identification and authentication, and that focusing on the discriminative details is more effective. The lighting processing method used in this thesis also effectively addresses the problems caused by lighting in complex scenes. In summary, the proposed data processing method and network architecture effectively improve the accuracy of speaker identification and authentication.
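
The segment-and-vote pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the window length, the stride, the form of the non-linear optimization, or how the adaptive weights are computed. The sketch therefore assumes fixed-length sliding windows, a power-law sharpening of each segment's class probabilities (the hypothetical `gamma` parameter), and per-segment weights supplied by the caller. The per-segment probabilities stand in for the output of the 3D convolutional network.

```python
from typing import List, Sequence, Tuple


def split_into_segments(num_frames: int, seg_len: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges of overlapping windows; end is exclusive.

    Windows overlap whenever stride < seg_len. Both values are assumptions,
    the abstract does not state them.
    """
    if seg_len <= 0 or stride <= 0:
        raise ValueError("seg_len and stride must be positive")
    if num_frames <= 0:
        return []
    if num_frames < seg_len:
        # Utterance shorter than one window: use it as a single segment.
        return [(0, num_frames)]
    return [(s, s + seg_len) for s in range(0, num_frames - seg_len + 1, stride)]


def sharpen(probs: Sequence[float], gamma: float = 2.0) -> List[float]:
    """Assumed non-linear step: raise probabilities to gamma and renormalise.

    With gamma > 1, confident predictions become more peaked and
    near-uniform predictions stay near-uniform.
    """
    powered = [p ** gamma for p in probs]
    total = sum(powered)
    if total == 0:
        return [1.0 / len(probs)] * len(probs)
    return [p / total for p in powered]


def weighted_vote(
    segment_probs: Sequence[Sequence[float]],
    weights: Sequence[float],
    gamma: float = 2.0,
) -> Tuple[int, List[float]]:
    """Fuse per-segment class probabilities by weighted voting.

    segment_probs[i] is the class distribution predicted for segment i;
    weights[i] is that segment's discriminative weight (given here, learned
    adaptively in the thesis). Returns (predicted class, fused scores).
    """
    if not segment_probs or len(segment_probs) != len(weights):
        raise ValueError("need one weight per segment and at least one segment")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    num_classes = len(segment_probs[0])
    fused = [0.0] * num_classes
    for probs, w in zip(segment_probs, weights):
        for c, p in enumerate(sharpen(probs, gamma)):
            fused[c] += w * p
    fused = [f / total_weight for f in fused]
    best = max(range(num_classes), key=fused.__getitem__)
    return best, fused
```

For example, `split_into_segments(10, 4, 2)` yields four windows that each share two frames with their neighbour, and `weighted_vote([[0.8, 0.2], [0.4, 0.6]], [0.9, 0.1])` selects class 0 because the heavily weighted, confident first segment dominates the weakly weighted second one.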
Keywords/Search Tags: identification and authentication, convolutional neural network, discrimination, light normalization