
Research On Language Recognition Based On TV Modeling In DBN-UBM-DBF System

Posted on: 2018-12-23 | Degree: Master | Type: Thesis
Country: China | Candidate: T Qi | Full Text: PDF
GTID: 2348330512485628 | Subject: Information and Communication Engineering
Abstract/Summary:
Language recognition (LR) is the process of automatically determining which language an unknown utterance belongs to by having a computer analyze a speech sample of arbitrary length. It is an important direction in speech signal processing and has been an active research topic for the past 20 years. As the underlying theory and algorithms have matured, LR technology has moved steadily toward practical application. Using the i-vector extracted by total variability (TV) modeling as the feature of a speech sample has been widely adopted by researchers building their own LR systems, owing to its mature theory and outstanding performance. This dissertation aims to obtain an i-vector representation that captures sufficient language information from utterances, and then to solve some practical problems in order to find better LR methods that are applicable to different languages and different test samples. The main work of this dissertation is outlined as follows:

1. TV modeling based on the DBN-UBM-DBF system is studied. First, building on an introduction to traditional TV modeling, the complete process of extracting the i-vector of a segment in the DBN-UBM-DBF system, which uses the output of different layers of the DBN, is described in detail. Then the noise compensation method is described and analyzed. Finally, experiments give the default configuration and the performance of the baseline system, which provide a unified performance benchmark for subsequent research.

2. Several mainstream LR back-end methods in i-vector space are analyzed and compared. First, the existing algorithms are summarized and classified, and their application in LR is introduced in detail. Then the development set is used to determine both the required configuration parameters and the related implementation details of each method, and different performance indicators are evaluated on the test set. Finally, based on the test results, the performance of the different methods under different test time conditions is compared, and the advantages and disadvantages of the methods are summarized to guide the follow-up improvement work.

3. An improved CDS algorithm based on prior knowledge of the language interclass variance is proposed. First, to address the practical performance bottleneck of CDS, prior knowledge of the language interclass variance in i-vector space is introduced into CDS. Then, to reduce the recognition errors caused by the markedly different contributions of individual i-vector dimensions to performance, the interclass variance is further weighted to obtain the improved CDS. Finally, the performance is tested and compared with the baseline to verify the effectiveness of the improved algorithm.

4. An adaptive Gaussian backend (AGB) LR method based on the LDOF criterion is proposed. First, to address the mismatch between test samples and trained models that arises from the diversity of classes, an AGB LR method that adapts to the test samples is presented. Then the local distance-based outlier factor (LDOF) criterion is defined to guide the dynamic selection, for each test sample, of a suitable training data subset similar to that sample from the multiple class training sets. Finally, the validity of the proposed algorithm is verified by experimental results.
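The two building blocks named in items 3 and 4 can be illustrated with a minimal sketch. It shows only the textbook forms: baseline cosine distance scoring of a test i-vector against per-language mean vectors, and the standard LDOF ratio (mean distance to the k nearest neighbours divided by the mean pairwise distance among those neighbours). It is not the dissertation's improved CDS or its AGB selection procedure, whose details the abstract does not give. The function names, the use of Euclidean distance, and the toy two-dimensional vectors are assumptions made for illustration.

```python
import math


def cosine_score(w, m):
    """Cosine similarity between a test i-vector w and a model vector m."""
    dot = sum(a * b for a, b in zip(w, m))
    norm_w = math.sqrt(sum(a * a for a in w))
    norm_m = math.sqrt(sum(b * b for b in m))
    return dot / (norm_w * norm_m)


def cds_classify(w, language_means):
    """Baseline CDS: pick the language whose mean vector scores highest.

    language_means is a hypothetical dict {language label: mean i-vector}.
    """
    scores = {lang: cosine_score(w, m) for lang, m in language_means.items()}
    return max(scores, key=scores.get), scores


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ldof(x, train, k):
    """Standard LDOF of x with respect to its k nearest neighbours in train.

    Values well above 1 suggest x lies outside its local neighbourhood.
    Assumes the k neighbours are not all identical points.
    """
    if k < 2 or k > len(train):
        raise ValueError("k must satisfy 2 <= k <= len(train)")
    neighbours = sorted(train, key=lambda t: euclidean(x, t))[:k]
    # Mean distance from x to its k nearest neighbours.
    d_knn = sum(euclidean(x, n) for n in neighbours) / k
    # Mean pairwise distance among the neighbours themselves.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    d_inner = sum(euclidean(neighbours[i], neighbours[j]) for i, j in pairs) / len(pairs)
    return d_knn / d_inner


if __name__ == "__main__":
    means = {"en": [1.0, 0.0], "zh": [0.0, 1.0]}  # toy language means
    print(cds_classify([0.9, 0.1], means))
    train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(ldof([0.5, 0.5], train, 4), ldof([5.0, 5.0], train, 4))
```

In this toy setup, a test vector sitting inside the training cluster gets a lower LDOF than one far away from it, which is the kind of signal that could be used to judge how well a training subset matches a test sample.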
Keywords/Search Tags: Language recognition, TV modeling, Cosine distance score, Principal component analysis, Interclass variance, Adaptive Gaussian backend, Local distance-based outlier factor