Voice Conversion Based On GMM And Codebook Mapping

Posted on: 2016-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: M M Wang
GTID: 2308330479997686
Subject: Signal and Information Processing
Abstract/Summary:
The human voice carries several kinds of information, such as the semantic content of what is said and the speaker's personality characteristics. Analyzing and processing this deeper information usually requires techniques from speech signal processing. Voice conversion is a relatively new branch of speech signal processing and a current research focus in the field. Speaker personality information is the main starting point for voice conversion research: voice conversion changes one speaker's personality characteristics so that they match those of a specific target speaker, while keeping the semantic information unchanged. Research on voice conversion helps drive progress in other areas of speech signal processing and can also benefit currently popular fields such as smart homes and artificial intelligence. It therefore has broad application prospects and considerable theoretical research value. The main work of this thesis is as follows:

Starting from the speech production model, the thesis introduces the mathematical model of the vocal system and common speech feature parameters, and then the basic theory of voice conversion, such as the analysis/synthesis model. It proposes a KLD method for aligning the source and target feature parameters; this method narrows the nearest-neighbor search space between source and target and reduces the amount of computation.

The thesis studies two spectral envelope conversion methods, the Gaussian mixture model (GMM) and vector codebook mapping, and analyzes their strengths and weaknesses. Spectra converted by the statistical GMM algorithm are excessively smoothed. To solve this problem, the thesis studies an improved method.
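The over-smoothing of GMM-converted spectra can be illustrated with the commonly used joint-density GMM conversion function, in which each converted value is a posterior-weighted blend of per-component linear regressions. The sketch below is a simplified scalar (one-dimensional feature) version and is not necessarily the exact formulation used in the thesis; the function names and the dictionary keys (`weight`, `mu_x`, `mu_y`, `var_xx`, `cov_yx`) are hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_convert(x, components):
    """Convert one scalar source feature x with a joint-density GMM.

    Each component is a dict with hypothetical keys:
      weight : mixture weight
      mu_x   : source mean,  mu_y   : target mean
      var_xx : source variance, cov_yx : target/source cross-covariance
    """
    # Posterior probability of each component given the source feature.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_xx"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [lk / total for lk in likelihoods]

    # Posterior-weighted sum of per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_xx"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )
```

Because the output is an average over components, it is pulled toward the component means, and when the cross-covariance terms are small the converted features spread out less than real target features do. This is the smoothing effect that a mean and covariance correction is meant to counteract.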
In the studied method, codebook mapping is used to correct both the mean and the covariance, which increases the dispersion of the acoustic features and in turn improves the quality of the converted speech.

The thesis also studies pitch frequency conversion. At present, almost all pitch frequency conversion methods model the pitch frequency and the spectral parameters separately, which inevitably degrades the quality of the converted voice. Here, an RBF network is used to convert the pitch frequency. The network establishes a link between the pitch frequency and the spectral parameters, so that the converted pitch frequency can track the fluctuations of the target pitch frequency and carries more of the target speaker's personality characteristics.

Finally, the improved methods above are evaluated in simulation. Subjective and objective tests show that the improved voice conversion method raises the quality of the converted speech and achieves a better conversion effect.
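The idea of linking pitch frequency to spectral parameters through an RBF network can be sketched as a regression from spectral feature vectors to pitch values. The example below is a minimal sketch under stated assumptions, not the thesis's implementation: it assumes Gaussian basis functions with fixed, pre-chosen centers and a shared width, and it fits only the output weights by linear least squares. The function names (`rbf_design`, `fit_rbf`, `predict_rbf`) and the choice of inputs and outputs are hypothetical.

```python
import numpy as np


def rbf_design(features, centers, width):
    """Gaussian RBF activations for each feature row, plus a bias column."""
    # Squared Euclidean distance between every feature row and every center.
    sq_dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    activations = np.exp(-sq_dist / (2.0 * width ** 2))
    bias = np.ones((features.shape[0], 1))
    return np.hstack([activations, bias])


def fit_rbf(features, pitch, centers, width):
    """Fit the output-layer weights by linear least squares."""
    design = rbf_design(features, centers, width)
    weights, *_ = np.linalg.lstsq(design, pitch, rcond=None)
    return weights


def predict_rbf(features, centers, width, weights):
    """Predict pitch values from spectral features with a fitted network."""
    return rbf_design(features, centers, width) @ weights
```

In use, `features` would hold one spectral parameter vector per frame (shape `(frames, dims)`), `pitch` the corresponding pitch frequency values, and `centers` a set of representative feature vectors, for example a subset of the training frames. How the centers and width are chosen strongly affects the fit and is left open here.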
Keywords/Search Tags: Voice Conversion, Pitch Frequency, Gaussian Mixture Model, Spectral Envelope Transformation