
Research On Text Dependent Speaker Recognition For Tibetan Amdo Dialect

Posted on: 2019-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: B Liu
Full Text: PDF
GTID: 2428330545981733
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of science and technology, intelligent devices have gradually entered people's daily lives. Human-computer interaction, the core of such devices, has slowly shifted from finger and gesture control to the most natural and efficient way for humans to communicate: speech. Speech supports human-computer interaction in two main ways. One is to let the machine produce speech on its own, and the other is to let the machine understand the words spoken by humans. As a product of current artificial intelligence, speech recognition is divided into semantic recognition and speaker recognition. As the names suggest, semantic recognition lets the machine understand what a person says, and speaker recognition lets the machine identify who is speaking.

This thesis combines the two in a study of a Tibetan dialect. Using hidden Markov models (HMMs) on the HTK platform, the machine is made to recognize both the identity of the speaker and the semantic content. First, the speakers are recorded and a corpus is created. The corpus is then preprocessed, the speech signal is analyzed, MFCC feature parameters are extracted, and the HMMs are built. Finally, the feature parameters are matched against the trained models, and the model with the highest probability gives the speaker's identity and the semantic content.

The main research work in this thesis is as follows.

Firstly, the thesis established a Tibetan Amdo dialect database. The laboratory did not previously have a corpus for this dialect. Therefore, 6 speakers of the Tibetan Amdo dialect (4 men and 2 women, aged 18-20) were randomly selected, and a total of 60 sentences and 120 words were recorded. The resulting sentence list and isolated-word list are used for speaker recognition and semantic recognition, respectively. The corpus was then preprocessed.

Secondly, the thesis extracted feature parameters. For speaker recognition, feature analysis is performed on the preprocessed speech signal and feature parameters are extracted; this thesis uses MFCC, a feature parameter commonly used in speaker recognition. For semantic recognition, because the Tibetan Amdo dialect is used, Tibetan transcription, the construction of a grammar and dictionary, and speech annotation are prerequisites for conducting the experiments successfully.

Thirdly, the thesis established a model library. The feature parameters extracted in the training stage are used to build a speaker model library and an isolated-word model library, respectively. In the recognition stage, the feature parameters of the test speech are scored against each model, and the model with the maximum probability is selected as the recognition result. The results of speaker recognition and semantic recognition are each counted, and from them a double recognition rate is obtained, that is, the rate at which both the speaker and the semantic content are identified correctly at the same time.

Experiments show that the average recognition rate for speaker recognition is 71.9% and the average recognition rate for semantic recognition is 88.3%. The average rate at which both are identified correctly at the same time is 58.4%.
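The decision rule and the counting of the three recognition rates described above can be illustrated with a minimal Python sketch. It is not the thesis's HTK workflow: the function names `pick_best` and `recognition_rates`, the labels, the log-likelihood scores, and the trial list are all hypothetical values chosen for illustration, and the sketch assumes each model has already produced a score for the test utterance.

```python
from typing import Dict, List, Tuple


def pick_best(scores: Dict[str, float]) -> str:
    """Return the label whose model gives the highest (log-)likelihood."""
    return max(scores, key=scores.get)


def recognition_rates(
    trials: List[Tuple[str, str, str, str]]
) -> Tuple[float, float, float]:
    """Count speaker, semantic, and double (both correct) recognition rates.

    Each trial is (true_speaker, predicted_speaker, true_word, predicted_word).
    """
    n = len(trials)
    speaker_ok = sum(ts == ps for ts, ps, _, _ in trials)
    semantic_ok = sum(tw == pw for _, _, tw, pw in trials)
    both_ok = sum(ts == ps and tw == pw for ts, ps, tw, pw in trials)
    return speaker_ok / n, semantic_ok / n, both_ok / n


# Hypothetical log-likelihoods of one utterance under three speaker models.
speaker_scores = {"spk1": -1520.4, "spk2": -1498.7, "spk3": -1533.0}
best = pick_best(speaker_scores)

# Hypothetical recognition outcomes, not data from the thesis.
trials = [
    ("spk1", "spk1", "w1", "w1"),  # both correct
    ("spk1", "spk2", "w2", "w2"),  # speaker wrong
    ("spk2", "spk2", "w3", "w4"),  # word wrong
    ("spk2", "spk2", "w1", "w1"),  # both correct
]
rates = recognition_rates(trials)
print(best, rates)
```

Because a trial counts toward the double recognition rate only when both decisions are correct, that rate can never exceed either individual rate, which matches the ordering of the reported figures (58.4% against 71.9% and 88.3%).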
Keywords/Search Tags: speech recognition, semantic recognition, speaker recognition, MFCC, HMM