The Research Of Speech's Pitch Detecting And Modeling

Posted on: 2008-07-18 | Degree: Master | Type: Thesis
Country: China | Candidate: G W Zhang
GTID: 2178360212493255 | Subject: Signal and Information Processing
Abstract/Summary:
In recent years, with the development of computer and digital signal processing technology, human-computer interaction technology has made great progress, as has speech synthesis, one of its important components. Many approaches to speech synthesis have been proposed. While the clarity of synthesized speech is satisfactory, its naturalness and prosody still fall short of expectations. Finding an effective way to simulate the prosody of natural speech has therefore long been an active research topic.

The prosodic features of speech include pitch, duration and margin, of which pitch is the most important. Accurate extraction of the pitch contour is significant for speech signal processing. As an essential tool for analyzing the prosody of speech, it has extensive applications in speech synthesis and speech recognition. It is also the basis for establishing an effective pitch model that can enhance the naturalness of synthesized speech.

The thesis first sketches the research background of the topic. It then describes the mechanism of speech production and its mathematical model, and discusses the features of speech in both the time domain and the frequency domain. This is followed by a review of existing pitch detection algorithms: it explains the principles and procedures of the dominant algorithms used in China and abroad (autocorrelation, LPC, AMDF and wavelet transformation) and briefly summarizes the state of research on pitch control and modeling in China and abroad. It then presents the theory of wavelet transformation and its properties.

The thesis proposes a new pitch detection algorithm based on the optimum scale of the wavelet transformation. The conventional wavelet-based pitch detection algorithm outputs the pitch contour by comparing the positions of the wavelet coefficient peaks at adjacent scales to locate the instants of glottal closure.
Nevertheless, many false peaks appear when the scale of the wavelet transformation is low, which reduces the algorithm's accuracy, and searching for and identifying peaks across many wavelet scales slows the algorithm down. The new algorithm in this thesis not only uses the wavelet transformation but also takes advantage of the physiological limitations of the articulators and the intrinsic characteristics of the pitch contour of speech. The algorithm first determines the optimum scale and then extracts the pitch contour by analyzing the wavelet transformation coefficients at that scale. The proposed algorithm effectively eliminates false peaks, which improves the accuracy of the result. Because it does not need to search for peaks across many scales, and because it uses a new peak-search method, it also shortens the time required for pitch detection.

Using the improved pitch detection algorithm, a pitch contour extraction experiment is carried out on a standard syllable speech database. For syllables of different tones, the thesis summarizes their typical pitch contours. Based on the experimental results, an improved Target pitch model is proposed and its expression is specified. The new model has a more reasonable Target, one that is closer to the actual pitch contour than that of the former model. After the pitch contour is extracted, the parameters of the model are derived by analysis-by-synthesis under the minimum MSE criterion. Compared with the original model, the improved model produces more authentic pitch contours that better reflect the variation of natural speech. The validity of the new model is thus verified.

The thesis ends with a summary of the work, together with the remaining open problems and the directions of future research.
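
For reference, the autocorrelation method named in the review above can be sketched as follows. This is a minimal illustration of that conventional baseline, not the wavelet-based algorithm the thesis proposes. The function name `estimate_pitch_autocorr`, the 60 to 500 Hz search range (a rough stand-in for the physiological limits on pitch), and the 0.3 voicing threshold are assumptions chosen for illustration and do not come from the thesis.

```python
import math


def estimate_pitch_autocorr(frame, sample_rate, f_min=60.0, f_max=500.0,
                            voicing_threshold=0.3):
    """Estimate the pitch (Hz) of one speech frame by normalized autocorrelation.

    Returns None when no lag in the search range correlates strongly enough,
    which is treated here as an unvoiced or silent frame (an assumption).
    """
    # Restrict candidate lags to an assumed plausible pitch range.
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)

    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]

    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        n = len(x) - lag
        cross = sum(x[i] * x[i + lag] for i in range(n))
        energy_a = sum(x[i] * x[i] for i in range(n))
        energy_b = sum(x[i + lag] * x[i + lag] for i in range(n))
        denom = math.sqrt(energy_a * energy_b)
        if denom == 0.0:
            continue  # silent segment: correlation is undefined
        score = cross / denom
        # The small margin keeps the shortest lag among near-ties, which
        # avoids picking a multiple of the true period (octave error).
        if score > best_score + 1e-9:
            best_lag, best_score = lag, score

    if best_lag is None or best_score < voicing_threshold:
        return None
    return sample_rate / best_lag
```

Applied frame by frame to a voiced utterance, the returned values trace a coarse pitch contour. The resolution is limited to integer lags, so a practical detector would usually add interpolation around the peak and smoothing across frames.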
Keywords/Search Tags: pitch detection, wavelet transformation, optimum scale, peaks, Target model