A Study Of Voice Conversion System Based On GAN

Posted on: 2020-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
Full Text: PDF
GTID: 2518306131466084
Subject: Computer technology
Abstract/Summary:
This thesis studies voice conversion between specific speakers. The algorithm studied here is trained on non-parallel data. It changes the personalized characteristics of the source speaker's voice so that the converted speech sounds like the target speaker, while the text content of the audio stays unchanged. Personalized voice conversion is an active research topic in speech signal processing, and voice conversion systems are widely used in practice and have considerable room for development. Most voice conversion methods are currently based on parallel corpora between specific speakers, but parallel corpora are difficult to obtain in most cases and require alignment of the feature sequences. In this thesis, I apply G²GAN, an algorithm for multi-domain image conversion based on non-parallel data, to spectrum conversion in voice conversion. I redesign the whole network structure, including the generator, the discriminator, and the domain classifier, to account for the differences between voice and image data. In the experiments, I extract mel-cepstral (MCEP) spectral features and the fundamental frequency from the speech signals of the source and target speakers, and convert these two features separately. I use the Gaussian normalization method to convert the fundamental frequency and the G²GAN method to convert the spectrum. After conversion, speech synthesis is carried out to obtain the final converted speech. I compare my algorithm with the StarGAN voice conversion algorithm, which is also based on a non-parallel corpus, and with the GMM algorithm, which is based on parallel data. The results show that my method is better than StarGAN and close to GMM.
Keywords/Search Tags: Voice Conversion, Non-parallel corpus, StarGAN, G²GAN, GMM
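
The fundamental-frequency step mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly used log-domain form of Gaussian normalization, in which the source speaker's log-F0 is standardized with the source mean and standard deviation and then rescaled with the target speaker's statistics. The abstract does not say whether the thesis works in the log domain or how it treats unvoiced frames, so both choices here are assumptions, and the function names `log_f0_stats` and `convert_f0` are hypothetical.

```python
import math
from statistics import mean, pstdev


def log_f0_stats(f0_frames):
    """Return (mean, std) of log-F0 over voiced frames (F0 > 0)."""
    log_f0 = [math.log(f) for f in f0_frames if f > 0]
    return mean(log_f0), pstdev(log_f0)


def convert_f0(f0_frames, src_stats, tgt_stats):
    """Map source F0 values toward the target speaker's log-F0 statistics.

    Assumption: unvoiced frames are marked with F0 <= 0 and are left at 0.
    """
    mu_s, sigma_s = src_stats
    mu_t, sigma_t = tgt_stats
    if sigma_s == 0:
        raise ValueError("source log-F0 has zero variance")
    converted = []
    for f in f0_frames:
        if f <= 0:
            converted.append(0.0)  # keep unvoiced frames unvoiced
        else:
            z = (math.log(f) - mu_s) / sigma_s  # standardize in the log domain
            converted.append(math.exp(z * sigma_t + mu_t))
    return converted
```

In use, the statistics would be estimated once per speaker from training utterances and then applied frame by frame to each source utterance, independently of the spectrum conversion.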