Voice Conversion Based On Improved Adaptive Training Using Non-parallel Speech Corpus

Posted on:2014-09-26

Degree:Master

Type:Thesis

Country:China

Candidate:C L Zhu

Full Text:PDF

GTID:2268330398464772

Subject:Signal and Information Processing

Abstract/Summary:

Voice conversion is a technique that modifies the speech of one speaker (the source speaker) so that it sounds as if it were uttered by another speaker (the target speaker). Traditional voice conversion algorithms are usually based on a parallel speech corpus and joint training, but parallel data are difficult to obtain and such systems are inflexible to extend in practical applications.

This paper presents a non-parallel, non-joint training algorithm for voice conversion using a Universal Background Model (UBM) and Maximum a Posteriori (MAP) adaptation. First, the pitch frequency and short-time spectra of all speech corpora are extracted with STRAIGHT, and the corresponding LPCC parameters are then obtained from the short-time spectra. A UBM is trained on the LPCC parameters from the non-parallel speech samples of all speakers; it reflects the speaker-independent statistical distribution of the features. Then, with the UBM acting as the prior model, each speaker-specific model is derived using a new parameter estimation based on MAP adaptation. Finally, the corresponding transformation function is obtained.

ABX and MOS experiments show that the proposed method achieves conversion performance equivalent to that of the traditional parallel-corpus-based method and that the system is more flexible to extend.

The research content of this paper mainly covers the following aspects:

1. Analysis of the feature parameters used in voice conversion, including vocal tract and prosodic parameters such as pitch frequency, short-time spectrum parameters, and speech duration.
2. A traditional voice conversion system using a Gaussian Mixture Model (GMM) with a parallel corpus is implemented, and the characteristics and existing problems of the conventional method are analyzed.
3. Based on non-parallel corpora, an improved adaptive training method for voice conversion is proposed, which solves the main problems of traditional conversion methods.
4. The Universal Background Model and speaker adaptation techniques are studied. Independent speaker-specific GMMs are trained using MAP adaptation.
5. The STRAIGHT analysis-synthesis algorithm is studied. STRAIGHT is used to analyze the pitch frequency and short-time spectrum parameters and, at the target speech synthesis stage, to control parameters of the synthesized speech such as its duration in the time domain.
6. A voice conversion system based on UBM and MAP adaptation with non-parallel corpora is constructed, and its performance is analyzed and evaluated.
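The abstract does not give the details of its "new parameter estimation", so the following is only a minimal sketch of the classical mean-only MAP adaptation of a diagonal-covariance GMM from a UBM, not the thesis's improved method. The function names, the `relevance` factor, and its default value of 16 are assumptions made for illustration.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal Gaussian.

    X: (T, D) frames; means, variances: (K, D). Returns (T, K).
    """
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    mahalanobis = np.sum(diff ** 2 / variances[None, :, :], axis=2)
    return -0.5 * (log_norm[None, :] + mahalanobis)


def map_adapt_means(weights, means, variances, X, relevance=16.0):
    """Adapt the UBM means toward one speaker's frames X (hypothetical helper).

    Only the means are adapted; weights and variances stay those of the UBM.
    """
    # Posterior probability of each UBM component for each frame.
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)

    # Sufficient statistics: soft frame counts and per-component data means.
    n_k = post.sum(axis=0)
    ex_k = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]

    # Data-dependent interpolation between the data mean and the UBM prior.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * ex_k + (1.0 - alpha) * means


if __name__ == "__main__":
    # Toy 1-D UBM with two components; the speaker only excites the second.
    ubm_w = np.array([0.5, 0.5])
    ubm_mu = np.array([[-2.0], [2.0]])
    ubm_var = np.array([[1.0], [1.0]])
    frames = np.full((100, 1), 3.0)
    print(map_adapt_means(ubm_w, ubm_mu, ubm_var, frames))
```

In this kind of scheme, every speaker model is adapted from the same UBM, so component k of one speaker's model corresponds to component k of another's. That index alignment is what would let a mapping between speakers be built without time-aligned parallel utterances; how the thesis derives its transformation function from the adapted models is not stated in the abstract.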

Keywords/Search Tags:

voice conversion, non-parallel corpus, non-joint training, UBM, MAP

Related items

1	A Study Of Voice Conversion System Based On GAN
2	Voice Conversion Based On CycleGAN Network Under Non-parallel Corpus
3	An Algorithm For Voice Conversion With Limited Speech Corpus
4	Nonparallel-Corpus-Based Multi Speaker Voice Conversion
5	Non-parallel Many-to-many Voice Conversion Based On Dynamic Convolution StyleGAN
6	Non-parallel Many-to-Many Voice Conversion Based On SE-ResNet Combining Speaker Embedding
7	The Research On Voice Conversion Algorithm Based On Improved Bilinear Frequency Warping For Parallel Or Nonparallel Corpora
8	Non-parallel Voice Conversion Using ACGAN And Variational Autoencoders Conditioned By Sentence Embedding
9	Neural Network Based Voice Conversion
10	Research On Any-to-many Voice Conversion Based On Non-parallel Data