Research And Implementation Of The Prosodic Adjustment Algorithm For Mandarin Text-to-speech System

Posted on: 2011-10-01 | Degree: Master | Type: Thesis
Country: China | Candidate: L Wang
GTID: 2178360308952462 | Subject: Circuit and System
Abstract/Summary:
Text-to-speech (TTS) is a widely applied speech technology. Current TTS systems based on waveform concatenation synthesize speech that is clear and intelligible, but its naturalness still needs improvement. Prosodic adjustment is one of the most effective ways to improve the naturalness of synthesized speech.

This thesis focuses on prosodic adjustment: the related algorithms are studied, and a Mandarin TTS system is developed.

First, prosody theory and the characteristics of Mandarin speech are analyzed and existing technical approaches to TTS systems are discussed; on this basis, a waveform synthesis method based on prosodic rules is proposed. Next, the key algorithms for prosodic adjustment are discussed, including PSOLA, F0 contour prediction using the Fujisaki model, and the determination of syllable and pause durations. Building on a partially revised Fujisaki model, an F0 contour prediction algorithm is proposed that takes full account of the pitch characteristics of the synthesis units in the high-frequency word library; it simulates the F0 contour of a sentence accurately while making only minor changes to the synthesis units. The function and design of the system's main modules are then introduced from a system perspective, together with the method used to construct the corpus. To improve the naturalness of the synthesized speech, both a high-frequency word speech database and a syllable database are constructed. Finally, from an implementation point of view, the function, inputs, and outputs of the main functions in the code are described, and several issues that need to be considered in programming are analyzed. The performance of the TTS system is evaluated using the MOS method, and the results show that the synthesized speech is relatively good in naturalness.
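As a rough illustration of the F0 contour prediction mentioned in the abstract, the sketch below evaluates the standard (unrevised) Fujisaki model, in which log F0 is a baseline plus the sum of phrase and accent command responses. It is not the thesis's partially revised model, whose details the abstract does not give. The function names, the command tuples, and the constants `alpha=3.0`, `beta=20.0`, `gamma=0.9` are illustrative assumptions.

```python
import math


def phrase_response(t, alpha=3.0):
    """Phrase control impulse response Gp(t) = alpha^2 * t * exp(-alpha*t), t >= 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control step response Ga(t), capped at the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents, alpha=3.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb      -- baseline frequency in Hz
    phrases -- list of (onset_time, amplitude) phrase commands
    accents -- list of (onset_time, offset_time, amplitude) accent commands
    """
    log_f0 = math.log(fb)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (
            accent_response(t - t1, beta, gamma)
            - accent_response(t - t2, beta, gamma)
        )
    return math.exp(log_f0)


if __name__ == "__main__":
    # Hypothetical commands: one phrase starting at 0 s, one accent from 0.3 s to 0.6 s.
    phrases = [(0.0, 0.5)]
    accents = [(0.3, 0.6, 0.4)]
    # Sample the contour every 100 ms over one second.
    for i in range(11):
        t = i * 0.1
        print(f"{t:.1f} s  {fujisaki_f0(t, 120.0, phrases, accents):.1f} Hz")
```

In a concatenative system, a contour sampled this way could serve as the pitch target that a PSOLA stage moves each synthesis unit toward. The command timings and amplitudes would have to come from a prosodic prediction step that is not shown here.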
Keywords/Search Tags:Mandarin, TTS, Prosodic Prediction, Prosodic Adjustment, PSOLA