The Research On Feature Parameters And Transformation Methods In Voice Conversion

Posted on: 2016-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: X T Chen
Full Text: PDF
GTID: 2308330473965539
Subject: Signal and Information Processing
Abstract/Summary:
Voice conversion aims to transform the voice of a source speaker so that it takes on the personality traits of a target speaker. The conversion is oriented toward the target speaker's voice features, so the converted speech carries the characteristics of the target speaker and the perceived voice of the source speaker is altered. This thesis studies conversion methods for the parameters that characterize speaker individuality, with the goals of using the speech parameters effectively and improving the conversion result. The main work is as follows:

Firstly, the adjustment of prosodic features, including pitch frequency and speech rate, is studied. In addition to realizing pitch frequency conversion, the thesis proposes a method for adjusting speech duration: a Gaussian model maps the speech duration, and the resulting duration ratio is used for interpolation, so that the synthesized speech is closer to the target speech in speech rate.

Secondly, the conversion rule for vocal tract characteristic parameters is studied. The generalization ability of an Artificial Neural Network (ANN) helps in converting speaker features, but a large number of hidden nodes often leads to a complex network structure. Therefore, a conversion method for vocal tract characteristic parameters based on an improved Radial Basis Function (RBF) neural network is proposed. In this method, the K-means algorithm computes the network center values and the Particle Swarm Optimization (PSO) algorithm optimizes the number of hidden nodes. This improves the fitting and conversion efficiency of the RBF network for multidimensional nonlinear parameters and increases the similarity between the converted speech and the target speech.

Thirdly, the voice conversion system is further improved. The amount of extracted vocal tract characteristic parameters is usually large, and for the same test utterance, different data segments yield different conversion results. To make full use of the extracted parameters and obtain a smaller training set with strong characteristics, a module for preprocessing the characteristic parameters is added to the system. Together with the module for adjusting speech duration, this completes the voice conversion system and improves the quality of the converted speech.
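
The RBF-based mapping described in the second contribution can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes Gaussian basis functions with one shared width `sigma`, output weights fitted by ridge-regularized least squares, and a hand-picked number of hidden nodes `k` (the thesis selects this number with PSO, which is omitted here). All function names and the toy data are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's K-means; returns k centres as lists of floats."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centres[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return centres


def hidden_layer(x, centres, sigma):
    """Gaussian RBF activations for one frame, plus a constant bias term."""
    acts = [math.exp(-sq_dist(x, c) / (2.0 * sigma ** 2)) for c in centres]
    return acts + [1.0]


def solve(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def train_rbf(src, tgt, k, sigma=0.5, ridge=1e-3, seed=0):
    """Fit centres with K-means, then output weights by ridge least squares."""
    centres = kmeans(src, k, seed=seed)
    phi = [hidden_layer(x, centres, sigma) for x in src]
    h, d = k + 1, len(tgt[0])
    # Normal equations: (Phi^T Phi + ridge * I) W = Phi^T Y
    a = [[sum(r[i] * r[j] for r in phi) + (ridge if i == j else 0.0)
          for j in range(h)] for i in range(h)]
    b = [[sum(r[i] * y[c] for r, y in zip(phi, tgt)) for c in range(d)]
         for i in range(h)]
    return centres, solve(a, b)


def convert(x, centres, weights, sigma=0.5):
    """Map one source frame to a predicted target frame."""
    acts = hidden_layer(x, centres, sigma)
    return [sum(a * w[c] for a, w in zip(acts, weights))
            for c in range(len(weights[0]))]


# Toy stand-in for time-aligned source/target vocal tract parameters.
ts = [3.0 * i / 39 for i in range(40)]
source_frames = [[t, math.sin(t)] for t in ts]
target_frames = [[math.cos(t), 0.5 * t] for t in ts]
centres, weights = train_rbf(source_frames, target_frames, k=8)
converted = convert(source_frames[10], centres, weights)
```

In this sketch `k` is the quantity that a PSO search would tune, for instance by scoring each candidate `k` by conversion error on held-out frames. That scoring rule is an assumption; the abstract does not state the fitness function used.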
Keywords/Search Tags: voice conversion, prosodic feature adjustment, vocal tract characteristic conversion, ANN optimization, GMM