An Algorithm For Voice Conversion With Limited Speech Corpus

Posted on:2019-12-30

Degree:Master

Type:Thesis

Country:China

Candidate:D Gu

Full Text:PDF

GTID:2428330572492960

Subject:Electronics and Communications Engineering

Abstract/Summary:

A speech signal carries several kinds of information, such as the speaker's identity, emotional state, and linguistic content. Voice conversion is a technique that replaces the identity information of the source speaker with that of the target speaker without changing the linguistic content. It has broad application prospects in spoofing and anti-spoofing, artificial intelligence, restoration of damaged speech, and entertainment-oriented speech interaction. However, two problems restrict its application: a large corpus from both the source and target speakers is needed before conversion, and the voice quality after conversion is poor.

For the case where the target speaker's corpus is limited, this dissertation proposes a voice conversion algorithm based on a unified tensor dictionary. First, parallel speech from N speakers is selected at random from the speech corpus to form the basis of the tensor dictionary. Then, after multi-series dynamic time warping is applied to the selected speech, N two-dimensional basic dictionaries are generated, which together constitute the unified tensor dictionary. During the conversion stage, the dictionaries of the source and target speakers are each built as a linear combination of the N basic dictionaries, using the speech of the two speakers. Experimental results show that with 14 basic speakers the algorithm achieves performance comparable to the traditional NMF-based method while requiring only a small target-speaker corpus, which greatly facilitates the application of voice conversion systems.

To deal with the low voice quality caused by the 'detail loss' of sparse representation algorithms, this dissertation also proposes a voice conversion algorithm based on harmonic-impulse separation. The algorithm improves on the unified tensor dictionary (UTD) algorithm by adding a preprocessing step of harmonic-impulse separation. The harmonic and impulse signals are converted by their own conversion systems, and the two converted outputs are summed to give the final converted speech. To perform the separation, the algorithm trains a harmonic dictionary and an impulse dictionary during the training stage. Because the conversion system uses the speech spectrum as its conversion feature, the dissertation further proposes two improvements: spectrum compression and residual compensation. Experimental results show that this algorithm effectively improves the voice quality of the conversion and achieves high-quality voice conversion with a small corpus. The quality of the converted speech is also higher than that of the non-negative matrix factorization (NMF) algorithm. The results further show that residual compensation contributes more to the objective evaluation metrics of the conversion system, while spectrum compression plays the more important role in the subjective evaluation of conversion performance.
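The idea of building a speaker dictionary as a linear combination of N basic dictionaries can be illustrated with a minimal NumPy sketch. It is not the dissertation's implementation: the abstract does not say how the combination weights or the activations are estimated, so the multiplicative (NMF-style) updates with a Euclidean cost, and every function name and array shape below, are assumptions chosen for illustration. The sketch assumes the N basic dictionaries are already frame-aligned (the role multi-series DTW plays in the abstract), so that atom k refers to the same speech content for every basic speaker.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the multiplicative updates


def combine(basis, weights):
    """Speaker dictionary as a weighted sum of the N basic dictionaries.

    basis:   (N, F, K) non-negative tensor, N aligned dictionaries (assumed layout)
    weights: (N,) non-negative combination weights
    returns: (F, K) dictionary
    """
    return np.tensordot(weights, basis, axes=1)


def estimate_activations(X, D, n_iter=200):
    """Non-negative activations H with X ~= D @ H (assumed Euclidean cost)."""
    H = np.ones((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        H *= (D.T @ X) / (D.T @ D @ H + EPS)
    return H


def estimate_weights(X, basis, H, n_iter=200):
    """Non-negative weights w with X ~= sum_n w[n] * basis[n] @ H (assumed update)."""
    n = basis.shape[0]
    A = (basis @ H).reshape(n, -1)  # each row is one flattened basis[n] @ H
    x = X.reshape(-1)
    w = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        w *= (A @ x) / (A @ (A.T @ w) + EPS)
    return w


def convert(X_src, basis, w_src, w_tgt, n_iter=200):
    """Encode source spectra with the source dictionary, decode with the target one."""
    H = estimate_activations(X_src, combine(basis, w_src), n_iter)
    return combine(basis, w_tgt) @ H
```

In this sketch only the N weights are speaker-specific, which is one way to see why a small amount of target speech could suffice: the bulk of the dictionary is shared across speakers, and the target data is used only to fit a short weight vector.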

Keywords/Search Tags:

Voice conversion, Limited corpus, Multi-DTW, Tensor dictionary, Harmonic Percussive Separation

Related items

1	An Algorithm For Voice Conversion With Noise Robustness
2	Nonparallel-Corpus-Based Multi Speaker Voice Conversion
3	Research On The Separation Algorithm Of Music Instruments And Singing Voice
4	Emotional Voice Analysis And Conversion Based On Parallel Corpus
5	Voice Conversion Based On Improved Adaptive Training Using Non-parallel Speech Corpus
6	A Study Of Voice Conversion System Based On GAN
7	Design And Implementation Of A Voice Conversion Algorithm Based On Sinusoidal Harmonic Model
8	Voice Conversion Based On CycleGAN Network Under Non-parallel Corpus
9	Age-Voice Conversion System Driven By Multi-Parameter
10	One-shot Voice Conversion Algorithm Design And Implementation Based On Representations Separation