Study on Voiceprint Recognition Algorithm Based on Deep Learning

Posted on: 2021-05-13; Degree: Master; Type: Thesis
Country: China; Candidate: M H Guo; Full Text: PDF
GTID: 2428330629952712; Subject: Software engineering
Abstract/Summary:
Voiceprint recognition is an identification technology that is widely used in many industries. It helps protect personal property and enterprise security, and it can provide clear and strong evidence in national security and judicial cases. However, because of uncertain factors such as age, emotion, upbringing environment, and noise, voiceprint recognition still needs further research to improve both its basic theory and its applied value. Voiceprint recognition, also known as speaker recognition, is a problem in the field of speech signal pattern recognition. It mainly consists of voiceprint feature extraction, voice feature training, and voice classification and recognition. Feature extraction is the core problem of the whole system and largely determines its performance.

This thesis uses deep learning to improve voiceprint feature extraction, using both a sequence network and a convolutional network to extract features. It proposes BiGE2E (BiLSTM with GE2E, Generalized End-to-End), which combines a BiLSTM with the generalized end-to-end loss function to improve on the unidirectional LSTM model. It also proposes an attention-embedded 3D CNN called 3DCNNAM (3D Convolutional Neural Networks with Attention Mechanism). Specifically, when the sequence network is constructed, the GE2E model is combined with a BiLSTM so that context information between the input layer and the output layer is better used. The trained features are evaluated with a similarity matrix, which compares each speaker's voiceprint embedding vectors with the centroids of all speakers. Experimental results on the same open-source TIMIT samples suggest that BiGE2E performs much better than GE2E.

When the convolutional network is constructed, attention is embedded into the 3D convolutional network. To suppress useless learned features, the effective expression of the target area is strengthened in both space and time. To let the model learn features adaptively, the same number of utterances per speaker is fed to the network so that it extracts both speaker-related information and within-speaker voice feature variation, and similarity is then scored using cosine distance. Experimental results show that 3DCNNAM performs better than 3DCNN. More attention layers are not necessarily better: introducing CBAM (Convolutional Block Attention Module) once performs better than introducing it twice. In summary, this work introduces a sequence network and a convolutional network for voiceprint recognition and proposes improvement strategies to optimize the voiceprint recognition model. The experimental results suggest that the proposed approach performs well.
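The similarity matrix described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the toy two-dimensional embeddings, and the use of plain cosine similarity are assumptions for illustration. The learnable scale and bias of the GE2E loss, and the exclusion of an utterance from its own speaker's centroid, are omitted here.

```python
import math

def centroid(vectors):
    """Element-wise mean of a speaker's utterance embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_matrix(embeddings):
    """embeddings[j][i] is the embedding of utterance i of speaker j.

    Returns sim[j][i][k]: the cosine similarity between that utterance
    and the centroid of speaker k (hypothetical layout, chosen for clarity).
    """
    centroids = [centroid(utts) for utts in embeddings]
    return [[[cosine(e, c) for c in centroids] for e in utts]
            for utts in embeddings]

# Toy data (made up): two speakers, two utterances each, 2-D embeddings.
toy = [
    [[1.0, 0.1], [0.9, 0.2]],   # speaker 0
    [[0.1, 1.0], [0.2, 0.8]],   # speaker 1
]
sim = similarity_matrix(toy)
```

In a well-trained model, each utterance should score highest against its own speaker's centroid, which is what the toy data above is constructed to show.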
Keywords/Search Tags:Voiceprint recognition, deep learning, BiGE2E, 3DCNNAM