Research On Language Identification Based On Acoustics And Phonology

Posted on:2007-04-13

Degree:Master

Type:Thesis

Country:China

Candidate:G N Dai

Full Text:PDF

GTID:2178360212475732

Subject:Signal and Information Processing

Abstract/Summary:

Language identification (LID) is the problem of identifying the language being spoken from a speech sample produced by an unknown speaker. It is an important task in automatic speech processing and has broad application prospects.

In the acoustic study, phonemes are abstracted into phoneme symbols that serve as the basic units for training and identification. The occurring-of-phonemes model and several improved models are studied, and a language identification system based on the occurring of phonemes is built. The phoneme symbols and their features are described by Gaussian mixture models (GMMs), and male/female speech classification is used during identification. The experimental results show that the improved models identify languages more effectively than the basic occurring-of-phonemes model, and that using separate models for male and female speech is very effective in improving identification accuracy.

In the phonological study, a language identification system based on pseudo-syllables is built. The speech signal is processed in five steps: speech segmentation, speech detection, consonant/vowel segment classification, pseudo-syllable feature extraction, and pseudo-syllable model building. The modified pseudo-syllable structure is called the consonant and vowel pseudo-syllable. It is described by feature vectors that include the frequency information of the consonant and vowel segments, the durations of the consonant and vowel segments, and the number of consonant segments. The pseudo-syllable (PS) model is obtained by GMM training, and the system based on pseudo-syllables is then built. The experimental results show that the consonant and vowel pseudo-syllable is an independent and useful identification unit, that the PS model trains quickly, and that its identification performance is good.

To improve the accuracy of the LID system, the fusion of the two systems using D-S (Dempster-Shafer) evidence theory is studied, and the fused system performs better.

The systems are evaluated on three languages from the OGI-TS speech corpus. The best performance before fusion of the two systems is 81.481%. After fusion, the best identification accuracy is 85.185%.
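The abstract does not say how the two systems' outputs are turned into evidence, so the following is only a minimal sketch of Dempster's rule of combination, the core operation of D-S evidence theory. The language labels, the mass values, and the function name `combine_dempster` are hypothetical illustrations and are not taken from the thesis.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the common subset.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {hyp: mass / norm for hyp, mass in combined.items()}


# Hypothetical frame of discernment: three candidate languages.
LANGS = frozenset({"lang_a", "lang_b", "lang_c"})

# Hypothetical masses from the acoustic system and the pseudo-syllable
# system; mass on LANGS expresses each system's residual uncertainty.
acoustic = {frozenset({"lang_a"}): 0.6, frozenset({"lang_b"}): 0.3, LANGS: 0.1}
pseudo_syllable = {frozenset({"lang_a"}): 0.5, frozenset({"lang_c"}): 0.3, LANGS: 0.2}

fused = combine_dempster(acoustic, pseudo_syllable)

# Decide on the single language with the largest fused mass.
best = max((hyp for hyp in fused if len(hyp) == 1), key=fused.get)
print(sorted(best), round(fused[best], 4))
```

In this toy setting, assigning some mass to the whole set of languages lets a system express uncertainty, so one confident system can outweigh a hesitant one after normalization. How the thesis actually derives masses from GMM scores is not stated in the abstract.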

Keywords/Search Tags:

Language Identification, Gaussian Mixture Model, Vector Quantization, Occurring of Phonemes, Pseudo-Syllable, D-S Evidence Theory

Related items

1	Speaker Recognition Technology Based On Vector Quantization And Gaussian Mixture Model
2	Research On Automatic Language Identification And Its Application
3	The Design And Implementation Of Automatic Language Recognition System
4	Acoustic Modeling Approach To Language Identification
5	Research On Minority Language Recognition Based On The Characteristics Of CV Syllables
6	Research On Language Identification
7	Support Vector Machine Based Language Recognition
8	Language Identification Based On The GMM-UBM Model
9	Based Text-independent Speaker Identification Technology
10	Language Identification Based On Gaussian Mixture Models