Research Of The Chinese Voice Conversion System

Posted on: 2009-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yang
GTID: 2178360242489304
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Voice conversion is a technology that changes the speech characteristics of a specific speaker so that the transformed speech sounds as if another speaker had spoken it. It is an interdisciplinary field that draws on signal processing, acoustics, linguistics, computer science, etc. Research on it will promote the study of other speech technologies. Voice conversion also has a wide range of applications, including film and television dubbing, voice therapy and voice disguise. Voice conversion technology therefore has important research value in both theory and application.

This dissertation analyzes the speaker-specific information carried by speech and builds a voice conversion system using a STRAIGHT-based algorithm with a Gaussian Mixture Model (GMM). Experiments were carried out to analyze the factors that affect the quality of the transformed speech.

The major work of this dissertation consists of the following parts:

1. The individual information expressed by acoustic features was analyzed. Voice source features and vocal tract features of different speakers are compared by analyzing their glottal waveform parameters and formant frequencies. The LSF coefficients and the fundamental frequency and its range are used as conversion parameters.

2. A Chinese voice conversion system was built using the STRAIGHT-based algorithm with a GMM. To evaluate this voice conversion algorithm, subjective and objective experiments on speech quality were performed. The over-smoothing problem of the GMM, which degrades the quality of the transformed speech, is pointed out.

3. The differences in voice source features and vocal tract spectrum between male and female speakers are discussed. A system for conversion between male and female voices was developed using a linear spectrum interpolation algorithm. The results show that speech converted from female to male speakers has higher quality than speech converted from male to female speakers. A vocoder application was also developed that can change the timbre of speech by modifying its pitch, spectrum and duration.

4. Experiments and analysis were conducted to find the factors that affect the quality of the transformed speech. The factors include the amount and type of training data, the number of Gaussian mixtures, and the difference between the source and target speakers.
① The joint density estimation method, which observes both the source and the target vectors, performs better than a previous approach.
② When the training data consist of syllables, transformed syllables have higher quality than transformed sentences; when the training data consist of sentences, transformed sentences have higher quality than transformed syllables.
③ Transformed speech quality improves as the training set size and the number of GMM mixtures increase.
④ Informal listening tests demonstrated that female-to-male voice conversion has higher quality than male-to-female conversion, and that female-to-female voice conversion has higher quality than male-to-male conversion.
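The joint density estimation method mentioned in item 4 can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes a single scalar feature (for example one LSF coefficient) instead of a full feature vector, and it assumes the GMM parameters have already been trained on aligned source-target pairs. The function names and the dictionary keys (`weight`, `mu_x`, `mu_y`, `var_x`, `cov_yx`) are hypothetical and chosen only for this example. The conversion is the standard posterior-weighted regression: each mixture component contributes a linear mapping from source to target, weighted by how likely that component is given the source feature.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert(x, components):
    """Map a scalar source feature x to the target space with a joint-density GMM.

    Each component is a dict with hypothetical keys (assumed, not from the thesis):
      weight  mixture weight
      mu_x    mean of the source feature
      mu_y    mean of the target feature
      var_x   variance of the source feature
      cov_yx  cross-covariance between target and source features
    """
    # Posterior probability of each component given the source feature only.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]

    # Posterior-weighted sum of the per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )


# Illustrative parameters only; real values would come from training.
example_gmm = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": -5.0, "var_x": 1.0, "cov_yx": 0.0},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 5.0, "var_x": 1.0, "cov_yx": 0.0},
]
converted = convert(0.0, example_gmm)
```

Because the output is an average over components, inputs that fall between components are pulled toward a blend of the target means. This kind of averaging is commonly associated with the over-smoothing noted in item 2.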
Keywords/Search Tags: Voice Conversion, STRAIGHT, GMM