A Study Of Voice Conversion System Based On GAN

Posted on:2020-10-16

Degree:Master

Type:Thesis

Country:China

Candidate:T Li

Full Text:PDF

GTID:2518306131466084

Subject:Computer technology

Abstract/Summary:

PDF Full Text Request

This thesis studies voice conversion between specific speakers. The goal is to modify the speaker-specific characteristics of a source speaker's voice so that the converted speech sounds like the target speaker while the linguistic content of the audio stays unchanged; the algorithm studied here is trained on non-parallel data. Personalized voice conversion is an active research topic in speech signal processing, with many practical applications and considerable room for development. Most current voice conversion methods rely on parallel corpora between specific speakers, but parallel corpora are difficult to obtain in most cases and require alignment of the feature sequences. In this thesis, I propose applying G²GAN, a multi-domain image conversion algorithm based on non-parallel data, to spectrum conversion in voice conversion. I redesign the overall network structure, the generator, the discriminator, and the domain classifier to account for the differences between speech and images. In the experiments, I extract MCEP spectral features and fundamental frequency (F0) features from the speech signals of the source and target speakers and convert the two separately: the fundamental frequency with the Gaussian normalization method and the spectrum with G²GAN. Speech synthesis is then carried out on the converted features to obtain the final converted speech. I compare my algorithm with the StarGAN voice conversion algorithm, which is also based on non-parallel corpora, and with a GMM algorithm, which is based on parallel data. The results show that my method performs better than StarGAN and comes close to GMM.
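
The abstract does not give the exact form of the Gaussian normalization used for the fundamental frequency. A common form in the voice conversion literature matches the mean and standard deviation of log-F0 between speakers, and the sketch below assumes that form. The function names `log_f0_stats` and `convert_f0` are hypothetical and are not taken from the thesis.

```python
import math
from statistics import fmean, pstdev


def log_f0_stats(f0):
    """Return (mean, std) of log-F0 over voiced frames (F0 > 0)."""
    log_f0 = [math.log(v) for v in f0 if v > 0]
    return fmean(log_f0), pstdev(log_f0)


def convert_f0(f0_src, src_stats, tgt_stats):
    """Shift and scale source log-F0 to the target speaker's statistics.

    Assumed form (not confirmed by the abstract):
        log f0' = (log f0 - mu_src) / sigma_src * sigma_tgt + mu_tgt
    Unvoiced frames, marked by F0 == 0, are left at 0.
    sigma_src must be non-zero.
    """
    mu_s, sd_s = src_stats
    mu_t, sd_t = tgt_stats
    return [
        math.exp((math.log(v) - mu_s) / sd_s * sd_t + mu_t) if v > 0 else 0.0
        for v in f0_src
    ]


# Toy F0 contours in Hz; real statistics would come from each speaker's training data.
src_f0 = [0.0, 100.0, 200.0]
tgt_f0 = [0.0, 200.0, 400.0]
converted = convert_f0(src_f0, log_f0_stats(src_f0), log_f0_stats(tgt_f0))
```

In this toy case the target contour sits exactly one octave above the source, so the converted contour lands on the target values while the unvoiced frame stays at zero.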

Keywords/Search Tags:

Voice Conversion, Non-parallel corpus, StarGAN, G²GAN, GMM

Related items

1	Non-Parallel Many-to-many Voice Conversion Method Based On Adaptive Trans-StarGAN
2	Non-parallel Many-to-Many Voice Conversion Based On SE-ResNet Combining Speaker Embedding
3	Non-parallel Many-to-many Voice Conversion Method Based On PSR-STARGAN
4	Research On Many-to-Many Voice Conversion Based On Multi-Scale StarGAN By Share-Learning For Non-parallel Corpora
5	Voice Conversion Based On Improved Adaptive Training Using Non-parallel Speech Corpus
6	Voice Conversion Based On CycleGAN Network Under Non-parallel Corpus
7	An Algorithm For Voice Conversion With Limited Speech Corpus
8	Nonparallel-Corpus-Based Multi Speaker Voice Conversion
9	Non-parallel Many-to-many Voice Conversion Based On Dynamic Convolution StyleGAN
10	The Research On Voice Conversion Algorithm Based On Improved Bilinear Frequency Warping For Parallel Or Nonparallel Corpora