Voice Conversion Based On Improved Adaptive Training Using Non-parallel Speech Corpus

Posted on: 2014-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: C L Zhu
Full Text: PDF
GTID: 2268330398464772
Subject: Signal and Information Processing
Abstract/Summary:
Voice conversion is a technique that modifies the speech of one speaker (the source speaker) so that it sounds as if it were uttered by another speaker (the target speaker). Traditional voice conversion algorithms are usually based on a parallel speech corpus and joint training, but parallel data is difficult to obtain, and such systems are inflexible to extend in practical applications.

This paper presents a non-parallel, non-joint training algorithm for voice conversion using a Universal Background Model (UBM) and Maximum a Posteriori (MAP) adaptation. First, the pitch frequency and short-time spectra of all speech corpora are extracted with STRAIGHT, and the corresponding LPCC parameters are derived from the short-time spectra. A UBM is then trained on the LPCC parameters from non-parallel speech samples of all speakers, so that it reflects the speaker-independent statistical distribution of the features. Next, with the UBM acting as the prior model, every speaker-specific model is derived using a new parameter estimation based on MAP adaptation. Finally, the corresponding transformation function is obtained.

ABX and MOS experiments show that the proposed method achieves conversion performance equivalent to that of the traditional parallel-corpus-based method, and that the system is more flexible to extend.

The research content of this paper mainly includes the following aspects:
1. Analysis of the feature parameters used in voice conversion, including vocal tract parameters and rhythm (prosodic) parameters, such as pitch frequency, short-time spectrum parameters, and speech length.
2. Implementation of the traditional voice conversion system using a Gaussian Mixture Model (GMM) with a parallel corpus, and analysis of the characteristics and existing problems of this conventional method.
3. An improved adaptive training method that achieves voice conversion from non-parallel corpora and solves the main problems of traditional conversion methods.
4. A study of the Universal Background Model and speaker adaptation technology, in which an independent speaker-specific GMM is trained for each speaker using maximum a posteriori adaptation.
5. A study of the STRAIGHT analysis-synthesis algorithm. STRAIGHT is used to analyze the pitch frequency and short-time spectrum parameters, and to control parameters of the synthesized speech, such as its length in the time domain, during target speech synthesis.
6. Construction of a voice conversion system based on UBM and MAP adaptation with non-parallel corpora, followed by an analysis and evaluation of the system's performance.
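
The adaptation step described above can be illustrated with a minimal sketch. The abstract does not give the thesis's "new parameter estimation", so the code below assumes the standard mean-only MAP adaptation with a relevance factor, applied to a diagonal-covariance GMM acting as the UBM. The `convert` function is likewise only an assumed illustration: because the source and target models are adapted from the same UBM, their components stay aligned, and a simple posterior-weighted mean shift is one possible transformation. All function names, the `relevance` parameter, and the toy values are hypothetical.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Responsibilities p(k | x_t) under a diagonal-covariance GMM.

    X: (T, D) feature frames; weights: (K,); means, variances: (K, D).
    """
    diff = X[:, None, :] - means[None, :, :]                  # (T, K, D)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_p = np.log(weights) + log_gauss                       # (T, K)
    log_p -= log_p.max(axis=1, keepdims=True)                 # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM to one speaker's frames X.

    Assumption: standard relevance-factor update, not the thesis's own
    estimation formula, which the abstract does not specify.
    """
    gamma = posteriors(X, weights, means, variances)          # (T, K)
    n_k = gamma.sum(axis=0)                                   # soft counts, (K,)
    # Posterior-weighted data mean per component; guard empty components.
    e_k = (gamma.T @ X) / np.maximum(n_k, 1e-10)[:, None]     # (K, D)
    alpha = (n_k / (n_k + relevance))[:, None]                # data vs. prior weight
    return alpha * e_k + (1.0 - alpha) * means

def convert(X, weights, src_means, tgt_means, variances):
    """Hypothetical mean-shift conversion between two adapted models."""
    gamma = posteriors(X, weights, src_means, variances)      # (T, K)
    return X + gamma @ (tgt_means - src_means)

# Toy UBM with two well-separated components (illustrative values only).
ubm_w = np.array([0.5, 0.5])
ubm_mu = np.array([[-10.0, 0.0], [10.0, 0.0]])
ubm_var = np.ones((2, 2))

# Frames from one speaker, offset from the UBM means.
speaker_frames = np.array([[-9.0, 0.0], [-9.0, 0.0], [11.0, 0.0], [11.0, 0.0]])
adapted = map_adapt_means(speaker_frames, ubm_w, ubm_mu, ubm_var, relevance=2.0)
```

With a small relevance factor the adapted means follow the speaker's data closely; with a large one they stay near the UBM prior, which is what lets speakers with little data still receive a usable model.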
Keywords/Search Tags: voice conversion, non-parallel corpus, non-joint training, UBM, MAP