Key Algorithm In High Quality Voice Conversion System

Posted on:2013-02-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhou

Full Text:PDF

GTID:2218330371957701

Subject:Signal and Information Processing

Abstract/Summary:

Voice conversion (VC) is a technique for transforming the personal characteristics of one speaker's voice (the source speaker) into those of another (the target speaker). Speech carries several kinds of information, of which the most important is semantic information and another is speaker individuality. The goal of a VC system is to modify the speaker's individuality while preserving the original semantic information, so that speech uttered by one speaker sounds as if it had been articulated by another. This thesis studies the key technologies of a high-quality VC system. The main work and contributions are as follows:

1. A VC system aims to transform voices, and in a high-quality VC system the synthetic speech should also be more natural and intelligible. Starting from the speech production model, the thesis studies the models and parameters used for speech signal analysis. It focuses on conversion methods, especially the algorithm based on Gaussian mixture models (GMM). The system is simulated and evaluated with both objective and subjective tests.

2. Traditional VC systems often produce unnatural-sounding converted speech. The thesis improves on this by changing the time scale of the speech, which is done by inserting the converted parameters before and after each word. Listening tests report that the naturalness and intelligibility of the converted voice are better than before.

3. In the VC system based on the improved algorithm above, Mel-frequency cepstral coefficients (MFCC) are adopted as the extracted features because they are better suited to auditory perception. 3-D MFCC diagrams and waveforms of the voices before and after conversion are given. The test results confirm that the transformed speech not only approximates the characteristics of the target speaker but is also more natural and intelligible.
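
The abstract names GMM-based conversion but does not state the conversion function. As an illustration only, one formulation that is common in the VC literature maps each source frame to a posterior-weighted sum of per-component linear regressions. The sketch below assumes scalar features and hand-set mixture parameters rather than trained ones; the function names and dictionary keys (`convert_frame`, `mu_x`, `cov_yx`, and so on) are hypothetical and are not taken from the thesis.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert_frame(x, components):
    """Map one scalar source feature x into the target speaker's space.

    components: list of dicts, one per mixture component, with the
    (hypothetical) keys:
      weight  - mixture weight
      mu_x    - mean of the source feature
      mu_y    - mean of the target feature
      var_x   - variance of the source feature
      cov_yx  - covariance between target and source features
    """
    # Posterior probability of each component given the source frame.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]

    # Each component contributes its own linear regression of y on x,
    # weighted by how likely that component is for this frame.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )


# Illustrative parameters only: two components with opposite target means.
example_gmm = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": -2.0, "var_x": 1.0, "cov_yx": 0.0},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.0},
]
converted = [convert_frame(x, example_gmm) for x in (-3.0, 0.0, 3.0)]
```

With a single component the mapping reduces to an ordinary linear regression. With several components it interpolates smoothly between the regressions, which is why a frame halfway between the two source means above is mapped halfway between the two target means. A real system would work on feature vectors such as MFCCs with full or diagonal covariance matrices and would estimate the parameters from time-aligned source and target speech.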

Keywords/Search Tags:

Voice Conversion, time-scale, Gaussian Mixture Model, Mel-Frequency Cepstrum Coefficient

Related items

1	The Research Of Extracting Of Pathological Voice's Characteristics And Recognition Based On Wavelet Transformation And Gaussian Mixture Model
2	GMM Voice Conversion System Based Time Length Changed
3	Research On The Voice Conversion System
4	Research On Technologies Of Voice Conversion Based On Gaussian Mixture Model
5	The Research On Restoration Of Throat Microphone Speech
6	Voice Conversion Based On GMM And Codebook Mapping
7	Study On Speaker Verification Technology Related To Text And Applications
8	Research On Speaker Recognition Based On Combination Of Features
9	Research On Methods For Voice Conversion
10	Research On The Chinese Voice Conversion System Based On GMM