
Analysis And Generation Of Focus In Continuous Speech

Posted on: 2014-05-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F B Meng
GTID: 1228330452453579
Subject: Computer Science and Technology
Abstract/Summary:
Emphasis is an important feature of intonation. Generating emphasis can improve the naturalness and expressiveness of synthesized speech, and it has broad application prospects in many kinds of human-computer interaction systems. This dissertation analyzes the acoustic patterns of emphasis in continuous speech and studies the modeling, conversion, and synthesis of emphasis. The main contributions are:

1. This dissertation proposes a method for quantifying and modeling emphasis based on local prominence. It first defines the local prominence feature, which reflects how prominent a syllable's acoustic features are within a local scope. It then builds a quantitative model of the acoustic feature changes from neutral to emphatic speech, covering both prosodic structure locations and locations relative to the emphasis, based on the local prominences of the acoustic features. This quantitative model provides the mathematical basis for the conversion and generation of emphatic speech.

2. This dissertation proposes an HMM-based framework for synthesizing emphatic speech with limited training data. The framework includes a two-pass decision tree to maintain naturalness, together with a cost-based HMM selection algorithm and a parameter compensation algorithm to improve the emphasis intensity of the synthesized speech. Based on this framework, two emphatic speech synthesis models are proposed: one based on statistical parameters and one based on context parameters in the decision tree. Experiments showed that speech synthesized by the proposed models had both higher naturalness and higher emphasis intensity than speech from existing models.

3. This dissertation proposes a conversion model from neutral to emphatic speech based on local prominence. The model uses a linear transformation matrix to describe the correlation between the acoustic feature changes from neutral to emphatic speech and the local prominences of the acoustic features of neutral speech. The dissertation also proposes an English emphatic speech synthesis model supervised by the conversion model. It extracts acoustic-feature-related labels from neutral speech using a discretization method, designs acoustic-feature-related questions for growing the decision trees, and builds HMMs controlled by the predicted parameters. At synthesis time, the proposed model uses the conversion model to predict the acoustic features of emphatic speech, and these predicted features are then used to supervise the HMMs in synthesizing emphatic speech. Because the HMMs are all trained on a neutral corpus, the demand for emphatic corpus data is reduced.

4. This dissertation proposes a nonlinear parameter generation algorithm for emphasis in Mandarin. It analyzes intonation characteristics in a large-scale corpus and finds that the f0 contour shows a downward trend over the course of an utterance. It then analyzes the local prominences of the acoustic features of emphasis and proposes a nonlinear mapping algorithm from the local prominence of the acoustic features of emphasis, together with the acoustic features of the other syllables in the prosodic phrase, to the acoustic parameters of emphasis. A Mandarin speech synthesis system supporting emphasis was built on the proposed algorithm. The experiment showed that the proposed algorithm could generate emphasis and improve the naturalness and expressiveness of the synthesized speech.
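The abstract does not give the formula for local prominence or the values of the linear transformation matrix, so the following is only a rough sketch of the general idea behind the local prominence feature and the linear conversion model. The ratio-to-neighbours definition, the function names `local_prominence` and `predict_change`, the window size, and all numbers are assumptions made for illustration and are not taken from the dissertation.

```python
from typing import List


def local_prominence(values: List[float], window: int = 1) -> List[float]:
    """Ratio of each syllable's feature value to the mean of its neighbours.

    Assumed definition for illustration only; the dissertation's actual
    formula is not stated in the abstract.
    """
    result = []
    for i, value in enumerate(values):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        # Neighbouring syllables inside the window, excluding syllable i.
        neighbours = values[lo:i] + values[i + 1:hi]
        if not neighbours:
            result.append(1.0)  # no context: treat as neither prominent nor weak
            continue
        mean = sum(neighbours) / len(neighbours)
        result.append(value / mean if mean else 1.0)
    return result


def predict_change(prominence: List[float],
                   weights: List[List[float]],
                   bias: List[float]) -> List[float]:
    """Linear map from local prominences to predicted feature changes.

    Computes W * p + b with a hypothetical weight matrix W and bias b.
    """
    return [
        sum(w * p for w, p in zip(row, prominence)) + b
        for row, b in zip(weights, bias)
    ]


# Hypothetical per-syllable f0 values (Hz) for a three-syllable phrase.
f0 = [200.0, 250.0, 200.0]
f0_prominence = local_prominence(f0)

# Hypothetical [f0, duration] prominences of one syllable, mapped to changes.
change = predict_change([1.25, 1.0], [[0.2, 0.0], [0.0, 0.5]], [0.0, 0.1])
```

In this toy setup the middle syllable gets a prominence above 1 because its f0 exceeds that of its neighbours. In practice the weights would be estimated from parallel neutral and emphatic recordings rather than set by hand.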
Keywords/Search Tags: acoustic feature analysis of emphasis, emphatic speech conversion, emphatic speech synthesis