Research On Initial-final Segmentation Of Chinese Based On Time-frequency Analysis

Posted on: 2013-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: D L Han
GTID: 2248330371964857
Subject: Signal and Information Processing
Abstract/Summary:
As a front-end processing stage in Chinese speech recognition and synthesis systems, initial-final segmentation plays a key role: its accuracy directly affects the subsequent model training and even the whole system. Time-frequency analysis examines a signal in the time domain and the frequency domain simultaneously, and as a powerful tool for non-stationary signal analysis it has become a research focus in signal processing and other areas. It also makes it convenient to describe the articulation differences between initials and finals. Time-frequency analysis therefore has important theoretical significance and application value in the study of Chinese speech. This thesis studies initial-final segmentation of isolated words based on time-frequency analysis methods. The specific work is as follows:

Firstly, segmentation based on Empirical Mode Decomposition (EMD) combined with the spectrogram was studied. The time-domain features used in traditional segmentation methods are easily corrupted by noise, whereas the dynamic spectrogram combines the characteristics of the spectrum and the time-domain waveform. Monosyllable segmentation with the spectrogram was first simulated and proved to be effective; on this basis, dual-syllable segmentation was implemented by combining it with the keeping-sign ratio feature improved with EMD. According to the experiments, EMD-based feature extraction improved the noise immunity of the features and better preserved the differences between consonants and vowels.

Secondly, monosyllable segmentation based on Matching Pursuit (MP) was proposed. To overcome the limitations of analyzing the time domain or the frequency domain separately, and to represent the differences between consonants and vowels in both domains simultaneously, the MP sparse decomposition algorithm was employed. Chinese syllable samples were represented by a group of Gabor atoms that best matched the local time-frequency structure of the speech. The variation curves of the Gabor atom parameters for each speech frame were thus obtained, which led to a new time-frequency segmentation method based on the distribution characteristics of the three Gabor atom parameters.

Finally, overlapping initial-final segmentation of Chinese speech based on the "overlapping phoneme segmentation strategy" was studied. Considering the transition within a Chinese syllable with its specific C+V structure, genetic matching pursuit was again employed to avoid the subjectivity and arbitrariness of the "absolute" strategy and to identify more accurately the consonant part following the vowel starting point. The actual ending of the consonant was identified from the variation of the atom parameters over the transition section of the speech signal.
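
The matching pursuit decomposition described in the abstract can be illustrated with a minimal sketch. It assumes real-valued Gabor atoms parameterized by scale, time shift, and frequency; the abstract does not name its three atom parameters, so this choice is an assumption. The parameter grid and the function names (gabor_atom, build_dictionary, matching_pursuit) are hypothetical. This is plain greedy matching pursuit over a fixed dictionary, not the genetic variant used in the thesis.

```python
import numpy as np


def gabor_atom(n, scale, shift, freq, phase=0.0):
    """Unit-norm real Gabor atom of length n: Gaussian window times a cosine."""
    t = np.arange(n)
    window = np.exp(-np.pi * ((t - shift) / scale) ** 2)
    atom = window * np.cos(2 * np.pi * freq * (t - shift) + phase)
    return atom / np.linalg.norm(atom)


def build_dictionary(n, scales, shifts, freqs):
    """Enumerate atoms over an (assumed) grid of scale, shift, and frequency."""
    params = [(s, u, f) for s in scales for u in shifts for f in freqs]
    atoms = np.stack([gabor_atom(n, s, u, f) for s, u, f in params])
    return atoms, params


def matching_pursuit(frame, atoms, params, n_iter=5):
    """Greedy MP: repeatedly pick the atom most correlated with the residual."""
    residual = np.asarray(frame, dtype=float).copy()
    picked = []
    for _ in range(n_iter):
        corr = atoms @ residual                 # inner product with every atom
        k = int(np.argmax(np.abs(corr)))        # best-matching atom
        residual = residual - corr[k] * atoms[k]  # remove its contribution
        picked.append((params[k], float(corr[k])))
    return picked, residual


# Illustrative grid (assumption): 128-sample frames, frequencies in cycles/sample.
N = 128
ATOMS, PARAMS = build_dictionary(
    N, scales=(16, 32, 64), shifts=range(0, N, 16), freqs=(0.05, 0.1, 0.2)
)
```

Running matching_pursuit on successive speech frames and recording the selected (scale, shift, freq) tuples gives per-frame parameter curves of the kind the abstract refers to. How the thesis turns those curves into an initial-final boundary decision is not stated in the abstract and is not modeled here.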
Keywords/Search Tags: initial-final segmentation, time-frequency analysis, spectrogram, matching pursuit, genetic algorithms