
Research On Key Technologies Of Speaker Recognition Based On Deep Learning

Posted on: 2022-10-09  Degree: Master  Type: Thesis
Country: China  Candidate: W Wang  Full Text: PDF
GTID: 2518306524493894  Subject: Master of Engineering
Abstract/Summary:
With the development of intelligent living, speaker recognition and speaker attribute classification are widely used in identity authentication, public security, and smart homes. However, real application scenes are complex: existing deep-learning-based voiceprint recognition and speaker attribute classification techniques achieve high accuracy in ideal, quiet conditions, but their robustness to environmental noise and their accuracy in noisy conditions still need improvement. This thesis aims to build highly robust speaker recognition and speaker attribute classification systems and to improve their accuracy in complex environments. The specific contributions are as follows:
1. This thesis proposes a speaker recognition model that combines an improved residual network based on an attention mechanism with an improved triplet loss. The model takes the spectrogram, which retains more speaker-specific speech information, as the network input, and uses the improved triplet loss, ctriplet, to control the inter-class and intra-class margins and to constrain the intra-class distance, yielding better recognition performance. The network is then evaluated on a Chinese clean speech set, an English clean speech set, and an English noisy speech set, demonstrating its robustness to noise and language.
2. This thesis proposes extracting high-level bottleneck features of speech with a deep belief network and splicing them with MFCC features to form a hybrid feature, b-mfcc. It then designs a speaker attribute classification model based on DenseNet and the b-mfcc features. Subsets of Mozilla Common Voice are selected, and the model is compared with speaker attribute classification based on traditional MFCC features to verify its performance.
3. Based on the above work and an analysis of target users' needs, this thesis designs and implements a prototype voiceprint recognition system that can perform the basic voiceprint recognition tasks, and completes the relevant tests on the system.
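The abstract does not give the formula for ctriplet, so the following is only a minimal sketch of the general idea it describes: a standard triplet loss plus an extra term that constrains the intra-class (anchor-positive) distance. The function name, the `intra_margin` and `weight` parameters, and the use of squared Euclidean distance are all assumptions made for illustration, not the thesis's actual formulation.

```python
import numpy as np


def triplet_loss_with_intra_class_term(anchor, positive, negative,
                                       margin=0.5, intra_margin=0.1, weight=1.0):
    """Illustrative triplet loss with an added intra-class constraint.

    anchor, positive, negative: embedding arrays of shape (batch, dim).
    All hyperparameters here are hypothetical, not taken from the thesis.
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)

    # Squared Euclidean distances for each triplet in the batch.
    d_ap = np.sum((anchor - positive) ** 2, axis=1)  # same speaker
    d_an = np.sum((anchor - negative) ** 2, axis=1)  # different speaker

    # Standard triplet term: push the negative at least `margin` farther
    # from the anchor than the positive.
    triplet_term = np.maximum(d_ap - d_an + margin, 0.0)

    # Assumed extra term: penalize same-speaker distances above `intra_margin`,
    # which keeps each class compact regardless of where the negative lies.
    intra_term = np.maximum(d_ap - intra_margin, 0.0)

    return float(np.mean(triplet_term + weight * intra_term))
```

With these assumed defaults, a triplet whose positive coincides with the anchor and whose negative is far away contributes zero loss, while a triplet whose positive and negative are equally far from the anchor is penalized by both terms.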
Keywords/Search Tags: Speaker Recognition, Speaker Attribute Classification, Deep Learning, Triplet Loss, Deep Belief Network