A New Lipschitz Generative Adversarial Network And Its Application In Voice Conversion

Posted on: 2021-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Zeng
Full Text: PDF
GTID: 2518306017972879
Subject: Computer Science and Technology
Abstract/Summary:
Many insightful Generative Adversarial Network (GAN) variants have been proposed to improve both the model structure and the loss function, achieving competitive performance in various computer vision tasks. Nevertheless, research aimed at stabilizing GAN training and improving the quality of generated samples is still ongoing, and further exploration of GAN applications in other tasks, such as voice conversion, is also a research problem of key importance. To address the problems of GANs based on the Wasserstein distance, this thesis proposes a novel GAN variant that strictly satisfies Lipschitz continuity, and tests the effectiveness of the proposed model in both image generation and voice conversion tasks. The main research results are as follows:

1. A new generative adversarial network based on a spectral bounding (SB) algorithm, referred to as SBGAN, is proposed. The goal of the SB algorithm is to efficiently compute the square root of the product of the 1-norm and the ∞-norm as an upper bound on the spectral norms of the discriminator, so as to enforce the Lipschitz condition. Experimental results reveal that the proposed SB method is conducive to the stability of GAN training and provides a more reasonable restriction of the parameter space than gradient penalty and spectral normalization.

2. An image generation framework based on SBGAN is designed to investigate the effectiveness of the proposed model. The network structure and loss function used in the framework are consistent with those of the WGAN-GP and SNGAN models. Extensive experiments show that the proposed SBGAN model outperforms WGAN-GP and SNGAN on both the CIFAR-10 and ImageNet datasets in terms of the standard inception score.

3. To further assess the effect of the SBGAN model and to tackle the limitations of existing voice conversion models, the thesis proposes an SBGAN-based voice conversion framework referred to as SBGAN-VC. The framework comprises extraction and reconstruction of speech signals with the STRAIGHT algorithm, and a spectrum transformation network that includes gated linear units and residual blocks. Extensive experiments on speaker and emotional voice conversion indicate that the proposed SBGAN-VC framework outperforms several state-of-the-art methods in terms of both subjective and objective evaluations...
Keywords/Search Tags:Generative Adversarial Networks, Lipschitz Continuity, Spectral Bounding, Voice Conversion
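
The spectral bound named in the abstract rests on a standard matrix inequality: for any weight matrix W, the spectral norm satisfies ||W||_2 <= sqrt(||W||_1 * ||W||_inf), where ||W||_1 is the largest absolute column sum and ||W||_inf is the largest absolute row sum. The sketch below is a minimal NumPy illustration of that inequality only. The function names `spectral_bound` and `bound_normalize`, the per-layer application, and the idea of dividing the weights by the bound are assumptions made for illustration; the abstract does not specify how the thesis applies the bound inside the discriminator.

```python
import numpy as np


def spectral_bound(w):
    """Return sqrt(||W||_1 * ||W||_inf), an upper bound on the spectral norm of W."""
    w = np.asarray(w, dtype=float)
    norm_1 = np.abs(w).sum(axis=0).max()    # largest absolute column sum
    norm_inf = np.abs(w).sum(axis=1).max()  # largest absolute row sum
    return float(np.sqrt(norm_1 * norm_inf))


def bound_normalize(w, eps=1e-12):
    """Hypothetical helper: rescale W by the bound so its spectral norm is at most 1."""
    w = np.asarray(w, dtype=float)
    return w / (spectral_bound(w) + eps)


# Compare the cheap bound with the exact spectral norm on a random matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
sigma = float(np.linalg.norm(W, 2))  # exact largest singular value (needs an SVD)
bound = spectral_bound(W)            # needs only absolute row and column sums
W_hat = bound_normalize(W)
```

Because the bound never underestimates the spectral norm, a layer rescaled this way is 1-Lipschitz with respect to the Euclidean norm, and the cost is two passes of absolute sums rather than a singular value decomposition or power iteration.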