Voice Conversion Based On Improved GMM And Short-Time Spectrum With Prosody

Posted on: 2009-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhang
Full Text: PDF
GTID: 2178360245463629
Subject: Signal and Information Processing
Abstract/Summary:
Speaker transformation algorithms aim to modify an utterance of a source speaker so that it sounds as if it were uttered by a target speaker, while preserving the original meaning. As a recent branch of speech signal processing, speaker transformation has broad applications and can advance research in speech analysis, speech coding, speech synthesis, speech enhancement, speech recognition, and related areas. The main work of this dissertation is as follows:
(1) In speaker transformation systems based on the conventional Gaussian Mixture Model (GMM), the speech quality of converted utterances is degraded by over-smoothing of the predicted spectrum. A conversion method using an improved GMM was developed to alleviate this over-smoothing by taking frame-to-frame continuity and variation into account in the objective function.
(2) A pitch conversion method is proposed that uses the short-time spectrum together with prosody features as the feature vector, so that frame-to-frame pitch changes are captured more fully and changes in spectral envelope and prosody are reflected simultaneously.
(3) The entire speaker transformation system is implemented in MATLAB, and the experimental results are evaluated with both subjective and objective tests. The results show that the algorithm describes the speakers' individual characteristics and prosody features more effectively. In addition, it alleviates the over-smoothing phenomenon and improves the sound quality of the transformed speech while changing the speaker's individuality.
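For context, the conventional GMM mapping that item (1) treats as the baseline can be sketched roughly as below. This is not the improved method of the thesis, and it is written in Python rather than the MATLAB used by the author. It assumes scalar (one-dimensional) features for brevity, and the component dictionaries, parameter values, and function names are hypothetical illustrations. Each frame is converted independently as a posterior-weighted average of per-component linear regressions, which is one commonly cited reason the predicted spectrum comes out over-smoothed.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, components):
    """Posterior probability of each mixture component given source feature x."""
    weighted = [c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_xx"]) for c in components]
    total = sum(weighted)
    return [w / total for w in weighted]


def convert_frame(x, components):
    """Conventional GMM regression: posterior-weighted sum of linear maps.

    y_hat = sum_i p_i(x) * (mu_y_i + cov_yx_i / var_xx_i * (x - mu_x_i))
    """
    post = posteriors(x, components)
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_xx"] * (x - c["mu_x"]))
        for p, c in zip(post, components)
    )


# Hypothetical two-component joint source/target model (illustrative values only).
components = [
    {"weight": 0.5, "mu_x": 0.0, "var_xx": 1.0, "mu_y": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mu_x": 4.0, "var_xx": 1.0, "mu_y": 6.0, "cov_yx": 0.5},
]

# Each frame is mapped on its own; no frame-to-frame continuity is modelled.
source_frames = [0.0, 1.0, 2.0, 3.0, 4.0]
converted = [convert_frame(x, components) for x in source_frames]
```

Because `convert_frame` looks at one frame at a time, nothing in this baseline constrains how the output evolves between neighbouring frames, which is the gap the improved objective function in item (1) is described as addressing.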
Keywords/Search Tags: Voice conversion, Improved GMM model, Pitch, Prosody