
Text-independent Speaker Age Recognition Based On GMM

Posted on: 2016-09-06
Degree: Master
Type: Thesis
Country: China
Candidate: H Yan
Full Text: PDF
GTID: 2308330482965996
Subject: Electronic and communication engineering
Abstract/Summary:
Pattern recognition is an application of artificial intelligence techniques. Its central idea is to build a model inside the computer that mimics human intelligence. The model is trained on available data so that its internal parameters adapt to and approximate the "true" situation according to a chosen criterion. In this work we apply pattern recognition to speaker voice analysis in order to obtain a rough estimate of a speaker's age.

During training, we extracted Mel Frequency Cepstral Coefficients (MFCC) from the speakers' voices as feature parameters. We then modeled the features of speakers in different age groups with Gaussian Mixture Models (GMM). We also combined a Universal Background Model (UBM) with the GMM to increase accuracy and reduce training time. For recognition, we used the same MFCC features as in the training phase and decided the age group by comparing the posterior probabilities of the test speech with respect to the speaker models of the different age groups. The results show that, by combining these state-of-the-art techniques, fairly good results can be obtained from a limited amount of training data, which provides information for further studies.

The main aspects of this study include:
1. Analysis of speaker voice features, including voice energy, voice frequency, MFCC, etc.
2. Applying GMM to the different age groups, training the models, and analyzing the results.
3. Applying GMM-UBM to the training data and analyzing the results.
4. Selecting different combinations of training and testing data and analyzing the results in each situation.
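
The recognition step described in the abstract (scoring test features against one GMM per age group and picking the best match) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the diagonal covariances, the equal priors across groups (so that comparing posteriors reduces to comparing likelihoods), the group labels, the 2-dimensional toy "features" standing in for MFCC vectors, and all function names are assumptions made for the example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of `frames` under a GMM.

    `gmm` is a list of (weight, mean, var) components (hypothetical layout).
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def classify_age_group(frames, models):
    """Return the best-scoring group label and all scores.

    Assumes equal priors over groups, so the highest likelihood
    is also the highest posterior.
    """
    scores = {label: gmm_log_likelihood(frames, gmm)
              for label, gmm in models.items()}
    return max(scores, key=scores.get), scores


# Toy models with made-up parameters; real models would be trained on MFCCs.
MODELS = {
    "young": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "senior": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])],
}

frames = [[0.2, 0.1], [0.9, 1.2], [0.5, 0.4]]  # stand-in for MFCC frames
label, scores = classify_age_group(frames, MODELS)
print(label, scores)
```

In a GMM-UBM setup, each group model would typically be derived from a shared background model rather than trained from scratch, but the scoring and decision step would keep the same shape as above.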
Keywords/Search Tags: GMM, UBM, MFCC, MAP, speaker recognition