
Research On Identification Methods Of Chinese Dialects Based On Statistical Characteristics

Posted on: 2011-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Wang
Full Text: PDF
GTID: 2178360305963797
Subject: Circuits and Systems
Abstract/Summary:
Dialect identification is the technology by which a machine determines a speaker's dialect region from the speaker's pronunciation. It is useful in multilingual information processing, machine translation, assisted manual consultation, public security, and other fields. Statistical language identification models, represented by the Gaussian mixture model (GMM), recast identification as the problem of estimating the distribution of speech features, and they can achieve good identification performance. In this paper, Mandarin and the dialects of Changsha, Shaoyang, and Hengyang were identified with GMM-based models. The main contents of the paper are as follows.

The basic principles of dialect identification are presented. The basic steps of dialect identification, namely extraction of speech feature coefficients, selection of the training model, matching of test data against the trained models, and the identification decision, are described in detail.

The methods for extracting the basic speech feature coefficients used for identification were studied. Because Chinese dialects are all tonal languages, dynamic characteristic coefficients were extracted by applying differencing to the basic coefficients as a second feature extraction step. New speech feature coefficients that reflect the speech characteristics more comprehensively were obtained by combining the basic speech feature coefficients.

A dialect identification model based on the GMM was built, and methods for selecting the model parameters were studied. Dialect identification experiments were conducted with the GMM-based models, and the performance of models trained on different speech feature coefficients was analyzed.
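The feature and model steps described above can be sketched as follows. This is only an illustration and not the thesis implementation: it assumes the basic features are already available as per-frame vectors, that the dynamic coefficients are computed with a common regression-style difference over neighbouring frames, and that each dialect GMM has diagonal covariances with parameters already estimated. The names `delta`, `with_deltas`, `gmm_avg_loglik`, and `identify` are hypothetical.

```python
import math

def delta(frames, width=2):
    """Dynamic coefficients: regression-style difference over +/- width frames.
    Frame indices are clamped at the start and end of the utterance."""
    n, dim = len(frames), len(frames[0])
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(dim):
            num = 0.0
            for k in range(1, width + 1):
                nxt = frames[min(t + k, n - 1)][d]
                prv = frames[max(t - k, 0)][d]
                num += k * (nxt - prv)
            row.append(num / denom)
        out.append(row)
    return out

def with_deltas(frames, width=2):
    """Combine each basic feature vector with its dynamic coefficients."""
    return [s + d for s, d in zip(frames, delta(frames, width))]

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood; gmm is a list of (weight, mean, var)."""
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        mx = max(terms)  # log-sum-exp for numerical stability
        total += mx + math.log(sum(math.exp(t - mx) for t in terms))
    return total / len(frames)

def identify(frames, models):
    """Pick the dialect whose GMM gives the highest average log-likelihood."""
    scores = {name: gmm_avg_loglik(frames, g) for name, g in models.items()}
    return max(scores, key=scores.get), scores
```

In use, one model per dialect (for example Mandarin, Changsha, Shaoyang, Hengyang) would be passed to `identify` as a dictionary, and the test utterance would be converted with `with_deltas` using the same settings as the training data.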
The simulation results show that the dynamic characteristic coefficients are more robust to noise, which improves the robustness and identification rate of the system.

To address the problems that a larger number of mixture components brings to the identification system, a new dialect identification method was proposed that uses a self-organizing map (SOM) neural network to classify the speech feature coefficients and uses the GMM as the basic identification model. The proposed method first classifies the speech feature coefficients with the SOM neural network, then builds a GMM-based identification model for each class, and finally fuses the identification results of the sub-models by summation. The simulation results show that the proposed method improves the performance of the identification system and is more practical.
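The SOM-plus-GMM scheme described above might look roughly like the sketch below. The abstract does not give the SOM topology, the training schedule, or the exact fusion formula, so everything here is an assumption made for illustration: a small one-dimensional SOM, frames routed to the sub-model of their best matching unit, and sum fusion taken as the sum of the sub-model log-likelihoods. The names `bmu`, `train_som`, and `identify_som_gmm` are hypothetical.

```python
import math
import random

def bmu(x, nodes):
    """Index of the best matching unit (closest SOM node) for frame x."""
    return min(range(len(nodes)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, nodes[i])))

def train_som(frames, n_nodes=2, epochs=20, lr0=0.5, seed=0):
    """Train a small 1-D SOM. The learning rate decays linearly and the
    neighbourhood shrinks from one neighbour to none halfway through."""
    rng = random.Random(seed)
    nodes = [list(f) for f in rng.sample(frames, n_nodes)]
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)
        radius = 1 if e < epochs // 2 else 0
        for x in rng.sample(frames, len(frames)):  # shuffled pass over the data
            w = bmu(x, nodes)
            for i in range(n_nodes):
                if abs(i - w) <= radius:
                    h = 1.0 if i == w else 0.5  # neighbours move half as far
                    nodes[i] = [n + lr * h * (a - n) for n, a in zip(nodes[i], x)]
    return nodes

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def gmm_loglik(x, gmm):
    """Log-likelihood of one frame; gmm is a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    mx = max(terms)
    return mx + math.log(sum(math.exp(t - mx) for t in terms))

def identify_som_gmm(frames, nodes, models):
    """models[dialect][c] is the sub-GMM trained on frames of SOM class c.
    Each frame is scored by the sub-model of its class, and the per-class
    results are then summed (assumed form of the sum fusion)."""
    classes = [bmu(x, nodes) for x in frames]
    scores = {}
    for name, subs in models.items():
        per_class = [0.0] * len(nodes)
        for x, c in zip(frames, classes):
            per_class[c] += gmm_loglik(x, subs[c])
        scores[name] = sum(per_class)
    return max(scores, key=scores.get), scores
```

Splitting the feature space first means each sub-GMM only has to model one region, so it can use fewer mixture components than a single GMM covering all frames, which is the motivation the abstract gives for the method.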
Keywords/Search Tags:Chinese dialects identification, speech feature, dynamic characteristic coefficients, Gaussian mixture models (GMM), self-organizing map neural network (SOM)