Research On Algorithms Of Speaker Recognition Based On Deep Learning

Posted on: 2020-01-07  Degree: Master  Type: Thesis
Country: China  Candidate: S Y Geng
GTID: 2428330602950382  Subject: Engineering
Abstract/Summary:
Voice signals are one of the most important means of communication in human life. Everyone's voice has its own unique characteristics. In theory, a voice is like a fingerprint: hardly any two people share the same voiceprint characteristics, so different speakers can be distinguished effectively by the characteristics of their voices. Speaker recognition is a biometric technology that automatically identifies a speaker from characteristic parameters of the voice signal that reflect the speaker's physiological and behavioral traits. Compared with other identity authentication methods such as fingerprint authentication, face recognition, and pupil recognition, speaker recognition is readily accepted by users, has low equipment cost, and scales well.

Deep learning has achieved remarkable results in pattern recognition, speech recognition, and image processing in recent years. Compared with traditional shallow learning, deep learning emphasizes the depth of the neural network model and highlights the importance of feature learning within the network, which makes samples easier to classify or recognize. Combining speaker recognition with deep learning can greatly increase recognition accuracy and thus promote the application of speaker recognition in identity authentication.

The work of this thesis is as follows. Two voice libraries were recorded. Voice Library 1 was recorded by a group of 14 students from the lab in a quiet, undisturbed office. Voice Library 2 was recorded by 50 employees of a company in a large working room. In addition, the voice data of 50 speakers from the open-source Mandarin Chinese database on the Kaldi platform was selected as Voice Library 3. These three voice libraries serve as the training and testing data for all subsequent experiments.

A traditional speaker recognition algorithm based on MFCC and its delta information is implemented, using the VQ-LBG algorithm for clustering. Changes in system performance are studied under different combinations of feature parameters and different codebook parameters. With the three voice libraries used for training and testing, the highest recognition ratios were 97.14%, 73.12%, and 98.26%, respectively.

A speaker recognition system based on a DNN is designed and implemented. Voice Library 2 was used to study how system performance changes with different feature parameters and numbers of hidden-layer nodes; the highest recognition ratio was 80.13%. Voice Library 3 was then used with the feature parameters and network layers fixed, and training and testing were repeated multiple times to study the fluctuation of system performance. The recognition ratio ranged from 96.36% to 98.07%, with an average of 97.36%.

A gender-based speaker recognition algorithm is proposed and implemented. Voice Library 3 was used to train and test the system multiple times under fixed parameters and network layers to study the changes in system performance. The recognition ratio ranged from 97.80% to 98.56%, with an average of 98.07%. Compared with the DNN-based algorithm, the gender-based algorithm raised the average recognition ratio by 0.71 percentage points (from 97.36% to 98.07%) and narrowed the fluctuation range from 1.71 to 0.76 percentage points.
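The VQ-LBG clustering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (lbg_codebook, avg_distortion, identify), the splitting factor eps, the iteration count, and the use of Euclidean distance are all assumptions made for illustration, and the feature frames (for example MFCC plus delta vectors) are assumed to have been extracted already.

```python
import math


def mean(frames):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def nearest(frame, codebook):
    """Index of the codeword closest to `frame` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))


def lbg_codebook(frames, size, eps=0.01, iters=20):
    """Grow a codebook by LBG splitting; `size` is assumed to be a power of two."""
    codebook = [mean(frames)]
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[c * (1 + s * eps) for c in cw]
                    for cw in codebook for s in (1, -1)]
        for _ in range(iters):
            # Assign each frame to its nearest codeword, then re-estimate.
            cells = [[] for _ in codebook]
            for f in frames:
                cells[nearest(f, codebook)].append(f)
            # An empty cell keeps its previous codeword.
            codebook = [mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


def avg_distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(math.dist(f, cw) for cw in codebook) for f in frames) / len(frames)


def identify(frames, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(frames, codebooks[name]))
```

In this sketch one codebook is trained per enrolled speaker, and a test utterance is attributed to the speaker whose codebook quantizes its frames with the least distortion. The codebook size is the kind of codebook parameter the abstract describes varying. Note that the multiplicative split assumes the initial mean is not the zero vector.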
Keywords/Search Tags:Speaker Recognition, Voiceprint, MFCC, Deep Learning, DNN network, gender recognition