Speaker Recognition Based On Sparse Representation With Short Utterance

Posted on: 2014-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: T L Wang
Full Text: PDF
GTID: 2268330401464407
Subject: Signal and Information Processing
Abstract/Summary:
Speaker recognition, whose goal is to recognize people from their utterances, is a biometric technology. It is widely used in forensics, Internet security, military defense, and other fields. Many problems remain in the practical advancement of speaker recognition technology, and training and testing under short-utterance conditions attract particular attention.

When enough data is available, existing systems perform well. However, when speech data is limited, especially when the speech used for both training and testing is only about 10 seconds long, system performance decreases rapidly. This is because most current speaker recognition systems are based on probabilistic statistical models, which depend on sufficient speech data. This thesis focuses on models for speaker recognition under short utterances. The main contributions of the thesis are as follows:

1. Sparse representation theory is introduced to speaker recognition under short utterances, following an analysis of the classification principle of sparse representation and its ability to deal with limited data. First, the testing sample is sparsely represented over a dictionary formed from the training samples. Next, the representation (coding) residual, measured by the l2-norm, is used for classification. Several l1-norm minimization algorithms are then described, and their performance is tested experimentally.

2. The sparse coding model essentially assumes that the representation residual follows a Gaussian distribution, which may not be accurate enough to describe the representation errors in real conditions. In this thesis, we draw on robust regression and weaken the restriction on the distribution of the representation residual. By supposing that the representation residual and the representation coefficients are each independent and identically distributed, sparse representation is formulated as a sparsity-constrained robust regression problem, named the robust sparse representation model. Experiments on speech databases demonstrate that our method efficiently estimates the representation residual and improves the robustness and performance of the system, with a best recognition rate of 99.31%.

3. After an in-depth analysis of the working mechanism of the sparse coding model, we find that, in addition to sparsity, collaborative representation, which uses the training samples from all classes to represent the testing sample, also contributes to the classification results. Sparse representation, which assigns the testing sample to the class that can reliably represent it using the fewest samples, requires sufficient training data, and this requirement cannot be met under short utterances. Such a strategy is also computationally expensive because of the l1-norm constraint, which seriously affects the real-time performance of the system. To address these deficiencies of sparse representation, we introduce the regularized least squares method and present collaborative representation based classification for speaker recognition. Experiments show that the method not only provides good performance but also computes efficiently, with a recognition speed of 0.045 seconds per sample.
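
As a rough illustration of the collaborative representation idea in contribution 3, combined with the l2-norm residual rule from contribution 1, the following is a minimal sketch and not the thesis's implementation. It assumes each utterance has already been reduced to one fixed-length feature vector (the abstract does not say which features are used), and the names `crc_classify` and `lam` are hypothetical. The test vector is coded over all training vectors with a regularized least squares solve, and the speaker whose training vectors leave the smallest reconstruction residual is chosen.

```python
import numpy as np


def crc_classify(train, labels, test, lam=0.01):
    """Collaborative-representation classification (illustrative sketch).

    train  : (d, n) array, one training feature vector per column (assumed layout)
    labels : length-n array of speaker labels, one per column of `train`
    test   : (d,) feature vector of the test utterance
    lam    : ridge regularization weight (assumed value, not from the thesis)
    """
    train = np.asarray(train, dtype=float)
    labels = np.asarray(labels)
    test = np.asarray(test, dtype=float)

    # Scale every dictionary column and the test vector to unit l2-norm.
    col_norms = np.linalg.norm(train, axis=0, keepdims=True)
    D = train / np.maximum(col_norms, 1e-12)
    y = test / max(np.linalg.norm(test), 1e-12)

    # Regularized least squares: alpha = (D^T D + lam * I)^(-1) D^T y.
    n = D.shape[1]
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)

    # Class-wise l2 residual: reconstruct y from one speaker's columns only.
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        residuals[c] = np.linalg.norm(y - D[:, mask] @ alpha[mask])

    # The predicted speaker is the one with the smallest residual.
    best = min(residuals, key=residuals.get)
    return best, residuals


# Toy usage with synthetic "speakers" (random cluster centres, not real speech).
rng = np.random.default_rng(0)
centres = rng.normal(size=(2, 20))
labels = np.array([0, 0, 0, 1, 1, 1])
train = np.column_stack([centres[k] + 0.1 * rng.normal(size=20) for k in labels])
test = centres[1] + 0.1 * rng.normal(size=20)
speaker, residuals = crc_classify(train, labels, test)
print(speaker, residuals)
```

Because the coefficients come from a single linear solve rather than an iterative l1-norm minimization, the per-sample cost is low, which is the efficiency argument made in contribution 3. Replacing the ridge solve with an l1-norm solver would give the sparse representation classifier of contribution 1 while keeping the same residual rule.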
Keywords/Search Tags: speaker recognition, short utterance, sparse representation, robust sparse representation, collaborative representation