
Research On The Extraction Of Voice Pitch Frequency Modes Based On Self-organizing Feature Map

Posted on: 2012-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: X Fu
Full Text: PDF
GTID: 2218330338462967
Subject: Computer application technology
Abstract/Summary:
With the development of society, the computer has become an indispensable part of human life, so communicating with computers easily has become an important problem. Voice is one of the most efficient means of communication, and people hope to communicate with computers directly by voice. As a result, text-to-speech systems have developed rapidly in recent years, and a large number of new technologies are emerging.

The prosody mode is essential to a text-to-speech system: it converts the results of text analysis into the acoustic parameters used to generate synthesized speech. Generating prosody modes that closely reflect prosodic phenomena is therefore one of the most important parts of a text-to-speech system. Extraction of pitch frequency modes is the basis for studying prosody modes. This thesis studies how to extract pitch frequency modes. The main achievements are as follows:

To obtain pitch frequency sequences for clustering, the voice data is preprocessed by segmenting syllables, marking pitch, adjusting length, smoothing, and shifting the mean to zero.

Two commonly used clustering algorithms are studied. We propose using the self-organizing feature map network as the clustering algorithm for extracting pitch frequency modes, because it is unsupervised and self-organizing and is suited to clustering. This algorithm overcomes some defects of the other algorithms.

Using a specific voice database as the experimental data, we extract 15 typical pitch frequency modes with the self-organizing feature map network and give the corresponding pitch frequency curves.

After the pitch frequency modes are extracted, the decision tree method will be used to mine the rules of the prosody model in order to simulate the voice synthesizer. This is left as future work.
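The preprocessing and clustering steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the map topology, the normalized contour length, or the smoothing filter, so the 1-D map, the 10-point length, the 3-point moving average, and every function name below are assumptions made for illustration. Syllable segmentation and pitch marking are assumed to have been done already, and the input contours are synthetic.

```python
import math
import random

def resample(contour, length):
    """Linearly interpolate a pitch contour to `length` points (length >= 2)."""
    if len(contour) == 1:
        return [float(contour[0])] * length
    out = []
    for i in range(length):
        pos = i * (len(contour) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out

def smooth(contour):
    """3-point moving average; endpoints average only the neighbours that exist."""
    n = len(contour)
    return [sum(contour[max(0, i - 1):min(n, i + 2)]) /
            len(contour[max(0, i - 1):min(n, i + 2)]) for i in range(n)]

def zero_mean(contour):
    """Subtract the contour's mean so that only its shape remains."""
    mean = sum(contour) / len(contour)
    return [v - mean for v in contour]

def preprocess(contour, length=10):
    """Adjust length, smooth, then shift the mean to zero (assumed order)."""
    return zero_mean(smooth(resample(contour, length)))

def best_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(weights[j], x)))

def train_som(data, n_units=15, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map; returns one weight vector per unit."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    radius0 = n_units / 2
    for epoch in range(epochs):
        decay = 1 - epoch / epochs            # linear decay schedule (assumed)
        lr = lr0 * decay
        radius = max(radius0 * decay, 0.5)
        for x in rng.sample(data, len(data)):  # one shuffled pass per epoch
            bmu = best_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid of unit indices
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Synthetic rising and falling contours (Hz) of varying length, for illustration.
gen = random.Random(1)
raw = []
for _ in range(20):
    raw.append([200 + 5 * t + gen.uniform(-1, 1) for t in range(gen.randint(6, 14))])
    raw.append([260 - 5 * t + gen.uniform(-1, 1) for t in range(gen.randint(6, 14))])

data = [preprocess(c) for c in raw]
modes = train_som(data, n_units=15)  # 15 units, matching the 15 modes reported
```

Each entry of `modes` is a weight vector that can be plotted as a candidate pitch frequency curve, and `best_unit(modes, preprocess(contour))` assigns a new syllable contour to its nearest mode.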
Keywords/Search Tags: text to speech, pitch frequency modes, self-organizing feature map network, clustering