Design And Implementation Of A Voice Conversion Algorithm Based On Sinusoidal Harmonic Model

Posted on: 2018-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: M L Li
Full Text: PDF
GTID: 2348330518499068
Subject: Engineering
Abstract/Summary:
Voice conversion technology emerged at a certain stage in the development of speech recognition and speech synthesis, and is an important branch of speech signal processing. The purpose of voice conversion is to change the source speaker's speech feature parameters so that speech synthesized from the converted parameters sounds as if it were spoken by the target speaker. In essence, it is a conversion of characteristic parameters. The technology touches almost every aspect of speech signal processing, and its research and development play an important role in advancing speech synthesis, speech coding, speech enhancement and speech recognition.

A voice conversion system consists of two stages: training and conversion. In the training stage, the parameter mapping rules are obtained. In the conversion stage, the speaker's individuality parameters are converted using the mapping function obtained in training, and the speech signal is then reconstructed from the converted parameters. A complete voice conversion system generally needs to consider three factors: an effective analysis-synthesis model, an ideal conversion rule, and characteristic parameters that represent speaker individuality. Results show that the sinusoidal model is a good parametric model. In this thesis, a voice conversion system using sinusoidal harmonic analysis is designed and implemented on the basis of research into the sinusoidal speech model. The work mainly covers the following aspects:

(1) Research on the speech analysis-synthesis model. An analysis and synthesis algorithm based on the sinusoidal speech model is studied and its peak extraction module is improved. The new peak extraction algorithm enhances the parameters of two adjacent frames, which improves the accuracy of peak extraction and, in turn, the quality of the synthesized speech.

(2) A simplified sinusoidal model, the sinusoidal harmonic model, is studied; it reconstructs the original speech well. Its purpose is to facilitate the training and conversion of speech feature parameters. First, the pitch frequency of the speech is estimated; then the harmonic amplitudes and phases of the sinusoidal harmonic model are estimated by the least squares method. The pitch frequency is the main component of the prosodic characteristics of the speech signal. It represents the characteristics of the excitation source and facilitates the conversion of the prosodic parameters.

(3) Design and implementation of a voice conversion algorithm based on the sinusoidal harmonic model. In the training stage, the pitch frequency and cepstral parameters of the source and target speech are extracted. The joint probability density of the cepstral parameters is modeled with a Gaussian mixture model (GMM), and the model parameters are estimated with the EM algorithm, in order to obtain the mapping rules for the spectral parameters. In the conversion stage, the cepstral parameters are converted according to the obtained mapping rules, and the mean linear method is used to convert the pitch frequency. In the conversion synthesis stage, the converted parameters are interpolated to improve the quality of the converted speech.

(4) To test the conversion of the characteristic parameters, separate experiments were carried out on the conversion of the pitch frequency and of the cepstral parameters between male and female speakers.

(5) To test the conversion quality and performance of the voice conversion system, the converted speech is evaluated with the ABX test as the subjective measure and the signal-to-noise ratio (SNR) as the objective measure. According to the ABX test results, the system designed in this thesis achieves complete conversion of the prosodic characteristics and partial conversion of the spectral envelope, and the quality of the converted speech is good.
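
As an illustration of the pitch conversion and objective evaluation steps mentioned in the abstract, the following Python sketch shows one common form of a mean-based linear pitch mapping, together with an SNR computation. The abstract does not give the formulas, so everything here is an assumption for illustration and not the thesis's implementation: the mean and standard deviation form of the mapping, the convention that unvoiced frames are marked with 0.0, and the function names `pitch_stats`, `convert_pitch` and `snr_db`.

```python
import math
from statistics import mean, pstdev


def pitch_stats(f0_track):
    """Return (mean, standard deviation) of the voiced F0 values in Hz.

    Assumption: unvoiced frames are marked with 0.0 and are ignored.
    """
    voiced = [f for f in f0_track if f > 0]
    return mean(voiced), pstdev(voiced)


def convert_pitch(f0_source, src_stats, tgt_stats):
    """Linearly map source F0 values toward the target speaker's range.

    Assumed mapping (not taken from the thesis):
        f_conv = mu_t + (sigma_t / sigma_s) * (f_src - mu_s)
    Unvoiced frames (0.0) are passed through unchanged.
    The source track must contain at least two distinct voiced values,
    otherwise sigma_s is zero and the scale is undefined.
    """
    mu_s, sigma_s = src_stats
    mu_t, sigma_t = tgt_stats
    scale = sigma_t / sigma_s
    return [mu_t + scale * (f - mu_s) if f > 0 else 0.0 for f in f0_source]


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB between a reference and an estimate.

    Uses the standard definition 10 * log10(signal energy / error energy).
    The two sequences must have the same length and must not be identical.
    """
    signal_energy = sum(x * x for x in reference)
    noise_energy = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    return 10.0 * math.log10(signal_energy / noise_energy)
```

In use, the statistics would be computed once from the training data of each speaker with `pitch_stats`, and `convert_pitch` would then be applied frame by frame to the source pitch track before synthesis. A mapping in the log-frequency domain is a frequent alternative, and which variant the thesis uses is not stated in the abstract.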
Keywords/Search Tags: Sinusoidal Harmonic Model, Voice Conversion, Pitch Frequency, GMM