The Research Of Voice Conversion Based On The Spectral Parameters Of Vocal Tract

Posted on: 2016-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Yao
GTID: 2308330473465558
Subject: Signal and Information Processing
Abstract/Summary:
The basic idea of voice conversion is to modify the characteristics of a source speaker's speech so that it sounds like that of a target speaker while leaving the linguistic content unchanged. This thesis focuses on voice conversion based on spectral characteristics. The work is carried out as follows:

Firstly, the Gaussian Mixture Model (GMM) ignores the nonlinear relationship between speakers and suffers from over-smoothing, which makes the converted speech unsatisfactory. To address this, a hybrid spectral conversion method combining the GMM with an Artificial Neural Network (ANN) is proposed, in which a Radial Basis Function (RBF) neural network transforms the mean vectors of the GMM to build a new transformation rule. Both objective and subjective tests indicate that the proposed method improves on the performance of the traditional voice conversion system and the quality of the converted speech.

Secondly, to counter the over-smoothing caused by the GMM, multi-resolution wavelet analysis is applied to voice conversion. Before the wavelet analysis, the training feature vectors are partitioned into disjoint clusters with fuzzy k-means clustering, in order to avoid an inaccurate transformation rule and low training speed. The simulation results show that the proposed method improves the articulation and intelligibility of the converted speech while also increasing the training speed.

Thirdly, traditional methods of training the RBF neural network suffer from slow convergence, parameters that become trapped in local minima, and poor generalization. To solve these problems, a method based on adaptive particle swarm optimization is proposed for training the RBF neural network, so that it captures the spectral envelope mapping between speakers. Both objective and subjective tests indicate that the proposed method reduces the spectral distortion of the converted speech and increases the similarity between the converted and target speech.
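
The fuzzy k-means step in the second contribution can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulation, so the code below uses the standard fuzzy c-means update rules. The function names `memberships_for`, `fuzzy_kmeans` and `hard_labels`, the fuzzifier `m = 2`, the evenly spaced initial centers, and the toy two-dimensional data are all assumptions made for illustration, not details taken from the thesis. Disjoint clusters are obtained here by assigning each vector to the cluster in which its membership is highest.

```python
import math


def memberships_for(points, centers, m):
    """Return one membership row per point; each row sums to 1."""
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if min(dists) == 0.0:
            # The point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            # Standard fuzzy c-means rule: weight proportional to d^(-2/(m-1)).
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            rows.append([v / sum(inv) for v in inv])
    return rows


def fuzzy_kmeans(points, k, m=2.0, iters=50):
    """Cluster equal-length feature vectors; return (centers, memberships)."""
    n, dim = len(points), len(points[0])
    # Assumed initialization: k training vectors spaced evenly through the list.
    centers = [list(points[(i * n) // k]) for i in range(k)]
    for _ in range(iters):
        u = memberships_for(points, centers, m)
        for c in range(k):
            # Each center becomes the mean of all points, weighted by u^m.
            w = [row[c] ** m for row in u]
            total = sum(w)
            centers[c] = [
                sum(wi * p[d] for wi, p in zip(w, points)) / total
                for d in range(dim)
            ]
    # Recompute memberships so they match the final centers.
    return centers, memberships_for(points, centers, m)


def hard_labels(memberships):
    """Turn fuzzy memberships into disjoint clusters by taking the largest one."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


# Toy stand-in for spectral feature vectors: two well-separated groups.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fuzzy_kmeans(data, k=2)
print(hard_labels(u))  # [0, 0, 0, 1, 1, 1]
```

In a conversion system of the kind described, a separate transformation rule would then be trained on each cluster, which is one plausible reason the clustering step reduces training time; the abstract itself does not spell out this detail.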
Keywords/Search Tags:Voice Conversion, Vocal Tract Spectrum Conversion, Gaussian Mixture Model, Artificial Neural Network, Fuzzy K-means Clustering, Wavelet Transformation, Adaptive Particle Swarm Optimization