Research On Speaker Verification In Complex Scenarios

Posted on: 2023-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Qin
GTID: 2568306776475414
Subject: Computer Science and Technology
Abstract/Summary:
Speaker verification is a biometric recognition technology that uses voiceprint information to determine a user's identity. With the development of artificial intelligence, smart devices such as smart speakers and smart TVs have become increasingly popular, and speech technology is now widely used. To keep these speech technologies secure, speaker verification is often used to ensure that they can be operated only by specific users. Speaker verification can also be applied to public security big data monitoring, finance and securities, and national defense and military fields.

In simple terms, speaker verification determines whether two utterances belong to the same speaker. The current mainstream method uses deep neural networks to map utterances to speaker embeddings that represent speaker identity information, and then verifies identity by comparing the similarity between the speaker embeddings of the registered and test utterances. In quiet environments, the x-vector system, one of the deep speaker embedding models, has improved performance to a certain extent. However, utterances recorded in real life contain various kinds of noise and reverberation, which cause the performance of existing speaker verification models to drop sharply, so challenges remain in the field. To address these problems, we propose an auxiliary adversarial task and a multi-branch feature aggregation method based on multiple attention weighting for speaker verification. The main contents and innovations of the thesis are as follows.

(1) A speaker verification method based on an auxiliary adversarial task is proposed. To resolve the mismatch in the speaker embedding space caused by changes in the speaker's position relative to the microphone in far-field conditions, an auxiliary adversarial task based on position classification is proposed to remove position information from the speaker embeddings. A gradient reversal layer inverts the gradient from the auxiliary task and propagates it back to the speaker verification task, so that the speaker verification task is optimized against the objective of the auxiliary task, which achieves adversarial training. Experiments on the Mandarin far-field speech dataset HI-MIA show that the proposed method improves the equal error rate and the minimum detection cost function by 10% and 3%, respectively, compared with the baseline model.

(2) A multi-branch feature aggregation method based on multiple attention weighting is proposed to address the problem of multiple sources of interference in complex environments. The method uses channel attention and point attention to aggregate multi-branch features from different perspectives, which extracts speaker-discriminative information and suppresses noise interference. Experiments on the natural-environment speech dataset VoxCeleb show that the proposed method achieves a 6% improvement in equal error rate over the baseline model, and experiments on a second dataset, CN-Celeb, show that the proposed method generalizes well.

(3) A prototype system for speaker verification is implemented. Based on the research results above, the prototype is implemented in the Python programming language. The system mainly consists of modules for speech recording, speech signal preprocessing, speaker feature extraction, similarity comparison, and graphical interface display. It provides functions such as voiceprint registration and dynamic-password identity verification, and has good application value.
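
The verification step described above (comparing the embeddings of a registered and a test utterance) and the equal error rate metric can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the function names, the 0.5 threshold, and the threshold-sweep approximation of the equal error rate are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (assumed scoring rule)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(enrolled, test, threshold=0.5):
    """Accept the test utterance if its score reaches the threshold.

    The threshold value is an arbitrary placeholder, not a tuned setting.
    """
    return cosine_similarity(enrolled, test) >= threshold


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    target_scores: scores of same-speaker trials.
    nontarget_scores: scores of different-speaker trials.
    Returns the mean of the false-accept and false-reject rates at the
    threshold where the two rates are closest.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejects: same-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False accepts: different-speaker trials at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

In practice the embeddings would come from a trained model such as an x-vector network, and the decision threshold would be tuned on held-out trials rather than fixed in advance.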
Keywords/Search Tags: Speaker verification, Auxiliary adversarial tasks, Multi-branch feature aggregation, Deep learning