Research On MFCC And GMM Speech Conversion Technology

Posted on: 2016-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: H Zheng
GTID: 2208330461478150
Subject: Computer application technology
Abstract/Summary:
Voice is one of the most direct means of communication between people, so speech signal processing has been a popular research direction for many years. Voice conversion is a technology that transforms the speech characteristics of one speaker into those of another while leaving the spoken content unchanged. It is an important branch of speech signal processing and draws on several disciplines, including physiology, acoustics, and signal processing. It also has broad application prospects in fields such as multimedia dubbing, treatment of vocal organ damage, and confidential communication.

This thesis considers the design of a voice conversion system based on Mel Frequency Cepstral Coefficients (MFCC) and the Gaussian Mixture Model (GMM). First, it introduces the definition and research value of voice conversion. It then analyzes the two main steps of voice conversion: extraction of the characteristic parameters and the conversion function. MFCC take the auditory characteristics of the human ear into account, while the GMM is good at fitting the conversion curve, which makes both of them strong choices for voice conversion. Next, a voice conversion system is designed and divided into three parts: the extraction part, the training part, and the conversion part, each of which is described in detail. Finally, the design and implementation are tested along several different directions, and Spectral Distortion and Mean Opinion Score (MOS) are used to evaluate the conversion results from both objective and subjective aspects.
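The abstract does not state the exact form of the conversion function. As a hedged illustration only, the sketch below assumes the widely used joint-density GMM mapping, in which a source feature vector x (for example, one frame of MFCC) is converted as F(x) = sum_i p(i|x) * (mu_y[i] + cov_yx[i] * inv(cov_xx[i]) * (x - mu_x[i])). The function names, parameter names, and toy values are hypothetical and are not taken from the thesis.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) at a single vector x."""
    d = x.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff)  # cov^{-1} (x - mean)
    norm = np.sqrt(((2.0 * np.pi) ** d) * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ solved) / norm


def convert_frame(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Map one source feature vector to the target speaker's feature space.

    Assumed (not confirmed by the thesis) joint-density GMM mapping:
    F(x) = sum_i p(i|x) * (mu_y[i] + cov_yx[i] @ inv(cov_xx[i]) @ (x - mu_x[i]))
    """
    # Weighted likelihood of x under each source-side mixture component.
    likelihoods = np.array([
        w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, mu_x, cov_xx)
    ])
    posteriors = likelihoods / likelihoods.sum()  # p(i | x)

    y = np.zeros_like(mu_y[0], dtype=float)
    for p, mx, my, cxx, cyx in zip(posteriors, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y


# Toy two-component model in two dimensions (illustrative values only).
weights = np.array([0.5, 0.5])
mu_x = np.array([[0.0, 0.0], [10.0, 10.0]])
mu_y = np.array([[1.0, 1.0], [20.0, 20.0]])
cov_xx = np.array([np.eye(2), np.eye(2)])
cov_yx = np.array([0.5 * np.eye(2), 0.5 * np.eye(2)])

converted = convert_frame(np.array([2.0, 0.0]), weights, mu_x, mu_y, cov_xx, cov_yx)
```

In a full system of the kind the abstract outlines, the mixture parameters would come from the training part, fitted on time-aligned source and target features, and the function would be applied frame by frame in the conversion part.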
Keywords/Search Tags: Voice Conversion, MFCC, GMM, MOS