
Support Vector Machine Based Language Recognition

Posted on: 2010-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: W H Lei
GTID: 2178360302959913
Subject: Signal and Information Processing
Abstract/Summary:
As globalization accelerates and international exchange grows closer, people travel across borders more frequently for economic, political, cultural, and tourism purposes, and they increasingly expect to communicate freely regardless of their native languages. Language Identification (LID), which assigns a given speech utterance to one of a set of target languages, is therefore increasingly valuable in speech recognition, automatic machine translation, defense, and daily life, and it has attracted wide attention from many research institutions.

Generally speaking, LID systems can be classified by feature type into acoustic-model-based and phonotactic-model-based systems, and by training method into generative and discriminative models. Combining PR, GMM, and SVM systems is currently a popular approach. This thesis focuses on applying SVM-based systems to Language Identification. It first introduces front-end feature extraction and channel-robustness techniques, and then analyzes the GLDS and GSV kernel functions in detail. On this basis, improvements to both are explored, and experiments show a large gain. The contributions include the following.

First, we compare the commonly used MFCC and LPCC features in principle, which leads us to fuse systems based on the different features. Techniques that reduce noise in the feature domain are a further key research direction.

Based on our analysis, the original GLDS-SVM system involves a trade-off between the training sample count and the duration mismatch between training and testing data, so a hierarchical framework is proposed. It splits the training utterances into sets of different durations and uses SVM models trained on the long-duration set to select data from the short-duration set, which reduces the training/testing mismatch to some degree while keeping the computational load small.
We also explore two complementary feature sets using a co-training style approach, which improves the performance of the fused systems by a further 30%.

The GSV system, which combines GMM-UBM and SVM, performs well with Vocal Tract Length Normalization and channel-robustness methods such as Nuisance Attribute Projection (NAP) and Factor Analysis (FA), and it is one of the state-of-the-art systems. However, its feature dimension doubles each time the number of Gaussian mixtures doubles, which brings high redundancy and heavy computation. To address this, building on the hierarchical framework from GLDS-SVM, Kernel Principal Component Analysis and Key Selection are introduced to reduce the dimensionality of large-mixture models and to enhance discrimination, and both prove very useful.
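
As a rough illustration of the GSV idea described in the abstract, the sketch below builds a GMM supervector from MAP-adapted UBM means and then removes a nuisance subspace with a NAP-style projection. It is a minimal sketch and not the thesis implementation: the function names, the relevance factor of 16, the square-root weight and variance scaling of the means, and the random placeholder matrix `U` are all assumptions made for illustration. In practice `U` would be estimated from channel variability in training data.

```python
import numpy as np

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """MAP-adapt the means of a diagonal-covariance UBM to one utterance.

    frames: (T, D) features; weights: (C,); means, variances: (C, D).
    The relevance factor is an assumed, commonly used default.
    """
    diff = frames[:, None, :] - means[None, :, :]                # (T, C, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss                       # (T, C)
    log_post -= log_post.max(axis=1, keepdims=True)              # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                      # component posteriors
    n = post.sum(axis=0)                                         # soft counts, (C,)
    data_means = (post.T @ frames) / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]
    return alpha * data_means + (1.0 - alpha) * means

def supervector(adapted_means, weights, variances):
    """Stack scaled component means into one vector of length C * D."""
    scaled = np.sqrt(weights)[:, None] * adapted_means / np.sqrt(variances)
    return scaled.ravel()

def nap_project(x, U):
    """Remove the subspace spanned by the orthonormal columns of U."""
    return x - U @ (U.T @ x)

# Toy data only: a 4-component UBM over 3-dimensional features.
rng = np.random.default_rng(0)
C, D, T = 4, 3, 200
weights = np.full(C, 1.0 / C)
means = rng.normal(size=(C, D))
variances = np.ones((C, D))
frames = rng.normal(size=(T, D))

sv = supervector(map_adapt_means(frames, weights, means, variances),
                 weights, variances)
# Placeholder nuisance basis (hypothetical); orthonormalised with QR.
U, _ = np.linalg.qr(rng.normal(size=(C * D, 2)))
sv_clean = nap_project(sv, U)
```

The supervector length is the number of mixture components times the feature dimension, which is why larger mixtures raise the SVM input dimension and motivate the dimensionality reduction discussed above.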
Keywords/Search Tags: Language identification, support vector machine, hierarchical framework, Gaussian mixture model, nuisance attribute projection, key selection