Building A Mandarin-Tibetan Speech Corpus With Multi-modal Physiological Information And Phonetic Feature Analysis

Posted on: 2018-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: R J C Lu
Full Text: PDF
GTID: 2348330542479627
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Tibetan information processing, research on Tibetan speech has attracted growing attention, and many theories and methods from experimental phonetics have been applied to Tibetan speech processing. However, there is still no complete bilingual Tibetan-Mandarin speech database with multi-modal physiological data that can be used to investigate phonetic correlations between Tibetan and Mandarin. Building a complete and precise Mandarin-Tibetan speech corpus with multi-modal physiological information is therefore significant for fundamental research on Tibetan phonetics and for related fields. This thesis focuses on the WeiZang dialect, one of the three Tibetan dialects. First, we build a Mandarin-Tibetan speech database with multi-modal physiological information. The corpus is designed to contain 41 Tibetan sentences, 27 Chinese sentences, 30 Tibetan consonants, 4 Tibetan vowels, and 25 Tibetan monosyllables. The multi-modal data are collected with an acquisition system built on Terason ultrasound, high-speed digital imaging, and an Electromagnetic Articulograph. We then manually label the recorded audio waveforms, obtaining a 2.5 TB physiological speech database. With the database in place, we first compare the Tibetan and Mandarin vowel spaces produced by Tibetan speakers and find that their Mandarin vowel space is influenced by the vowel space of their mother tongue. We then compare the Tibetan, Mandarin, and English vowel spaces produced by the respective native speakers. The Tibetan vowel space is similar in size to the Mandarin and English vowel spaces, which suggests that the number of vowels in a language has no obvious effect on the size of its vowel space.
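The abstract does not say how vowel space size was measured. One common convention is the area of the polygon spanned by a speaker's corner vowels in the F1-F2 formant plane. The sketch below illustrates that convention only, and it is not the thesis's method. The function name `polygon_area` and all formant values are hypothetical placeholders, not data from the corpus.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (F2, F1) pairs in Hz, listed in order
    around the polygon (clockwise or counter-clockwise).
    """
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical mean formants (F2, F1) in Hz for corner vowels /i/, /a/, /u/.
# These are illustrative placeholders, NOT measurements from the thesis.
tibetan_corners = [(2300.0, 300.0), (1300.0, 800.0), (800.0, 350.0)]
mandarin_corners = [(2250.0, 290.0), (1350.0, 850.0), (750.0, 340.0)]

tibetan_area = polygon_area(tibetan_corners)
mandarin_area = polygon_area(mandarin_corners)

# A ratio near 1.0 would indicate vowel spaces of similar size.
print(f"Tibetan vowel space area:  {tibetan_area:.0f} Hz^2")
print(f"Mandarin vowel space area: {mandarin_area:.0f} Hz^2")
print(f"Size ratio: {tibetan_area / mandarin_area:.2f}")
```

Areas computed in raw Hz are sensitive to speaker differences such as vocal tract length, so comparisons across speaker groups are often made after formant normalization or on a perceptual scale. That step is omitted here for brevity.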
Keywords/Search Tags:Mandarin-Tibetan bilingual, Multi-Modal, Speech, Physiological, Database, Vowel