Prosody Extraction And Description Of Chinese Mandarin Continuous Speech

Posted on:2008-03-03

Degree:Master

Type:Thesis

Country:China

Candidate:W W Wang

Full Text:PDF

GTID:2178360212493460

Subject:Circuits and Systems

Abstract/Summary:

Speech is one of the most important means of exchanging information. Since the 1990s, speech technology in China has grown rapidly. With the fast development of Chinese speech coding, recognition, synthesis, conversion and other speech techniques, related products have come into wide use in many areas. Speech is no longer limited to delivering information by speaking and listening, with voice as the only medium. Through multiple media such as intelligent terminals and the Internet, information can be expressed and delivered more conveniently and quickly by combining speech, text and images.

When people communicate, much implicit information is conveyed through prosody rather than through the words themselves, so prosodic modeling plays an important role in speech synthesis systems. In an embedded Text-to-Speech (TTS) system, limited resources make it impossible to store units that fully match every context, so it is difficult to make the synthesized speech sound natural. This thesis investigates a method that combines speech analysis and text analysis to obtain prosodic information. Prosody extraction from speech and/or text, together with labeling, can be performed on servers, leaving information parsing and speech synthesis to intelligent terminals such as mobile phones and PDAs.

The aim of this thesis is to improve the naturalness of synthesized speech through the processing and analysis of speech. Prosody is obtained in the following steps. First, basic phonetic processing is applied to Mandarin continuous speech. Second, prosodic information is extracted from the continuous speech based on the acoustic characteristics of prosody. Finally, the prosodic information obtained from speech is labeled in a standard way.

In the basic phonetic processing stage, syllables are segmented, F0 is tracked, and the F0 contour of each syllable is smoothed.
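The abstract does not say which smoothing method the thesis uses. As a rough illustration of per-syllable F0 contour smoothing, the sketch below applies a median filter, which is a common choice but is an assumption here. The function name `smooth_f0`, the `window` parameter, and the convention that a value of 0 marks an unvoiced frame are all hypothetical.

```python
from statistics import median


def smooth_f0(f0, window=5):
    """Median-smooth one syllable's F0 contour (values in Hz).

    Assumption: a value of 0 marks an unvoiced frame. Unvoiced frames
    are kept as 0 and are ignored when smoothing their neighbours.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i, value in enumerate(f0):
        if value <= 0:
            smoothed.append(0.0)  # leave unvoiced frames untouched
            continue
        lo, hi = max(0, i - half), min(len(f0), i + half + 1)
        voiced = [v for v in f0[lo:hi] if v > 0]
        smoothed.append(float(median(voiced)))
    return smoothed


# Hypothetical contour with one octave-jump tracking error (400 Hz).
contour = [0, 200, 202, 400, 205, 207, 0]
print(smooth_f0(contour, window=3))
```

With a window of 3, the isolated 400 Hz frame is replaced by the median of its voiced neighbourhood (205.0), while the unvoiced frames at both ends stay at 0.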
For prosodic information extraction, syllable tone, prosodic boundaries, stress and intonation are recognized. Hidden Markov models (HMMs) are used for both tone recognition and intonation classification, and neural networks are used for stress detection. The extraction results largely agree with auditory judgments obtained from listening tests. Finally, SSML 1.0 is extended and used for prosody labeling so that the prosodic information can be described accurately.
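The abstract does not describe how the thesis extends SSML 1.0, so the following sketch uses only standard SSML 1.0 elements (`prosody`, `emphasis`, `break`) to show how extracted prosodic information might be written out as labels. The function `to_ssml`, the dictionary keys (`text`, `pitch_hz`, `stressed`, `break_after`), and the sample values are hypothetical and are not taken from the thesis.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def to_ssml(syllables, lang="zh-CN"):
    """Serialize per-syllable prosody records as a standard SSML 1.0 string.

    Each record is a dict with hypothetical keys:
      text        - the syllable's characters
      pitch_hz    - a pitch value in Hz (e.g. mean F0)
      stressed    - True if the syllable was judged stressed
      break_after - None, or an SSML break strength such as "medium"
    """
    speak = ET.Element(
        "speak", {"version": "1.0", "xml:lang": lang, "xmlns": SSML_NS}
    )
    for syl in syllables:
        prosody = ET.SubElement(
            speak, "prosody", {"pitch": f"{syl['pitch_hz']:.0f}Hz"}
        )
        if syl["stressed"]:
            # Stress is marked with the standard <emphasis> element.
            emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
            emphasis.text = syl["text"]
        else:
            prosody.text = syl["text"]
        if syl["break_after"]:
            # A prosodic boundary is marked with a <break> element.
            ET.SubElement(speak, "break", {"strength": syl["break_after"]})
    return ET.tostring(speak, encoding="unicode")


# Illustrative input only; the values are made up.
example = [
    {"text": "你", "pitch_hz": 210, "stressed": False, "break_after": None},
    {"text": "好", "pitch_hz": 180, "stressed": True, "break_after": "medium"},
]
print(to_ssml(example))
```

A labeling scheme like this keeps the server-side output in a standard format that a terminal-side parser could read; any thesis-specific extension elements or attributes would have to be added on top of it.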

Keywords/Search Tags:

prosody extraction, syllable separation, SSML

Related items

1	Research On Syllable Lattice Based Chinese Spoken Document Retrieval Method
2	Based On The Characteristics Of Cv Syllable Minority Language Recognition Research
3	Recognition Of Handwritten Tibetan Syllable Words
4	Research On 3D Visible Speech Animation Driven By Prosody Text
5	Syllable-based Method Of Tone Recognition For Chinese Continuous Speech
6	The Study Of Prediction Methods On The Mongolian Prosody
7	Research On Chinese Syllable Evaluation Approach After Automatic Speech Recognition
8	With Noise Fanaticism, Extraction And Separation Technology Research
9	Extraction Of Tibetan Syllable Based On Feature Recognition
10	Research On Acoustic Analysis And Prosody Modeling For Xian-Dialect