Research Of Chinese Tone Recognition Based On Time-frequency Analysis

Posted on: 2014-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z D Xu
Full Text: PDF
GTID: 2268330401955010
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Chinese is a tonal language, and tone is an important acoustic feature. In Chinese, tone plays an important role in distinguishing meaning and in word formation, and it is a powerful means of segmenting continuous speech. Tone is also of great significance in speech recognition, speech synthesis, and related research. In this thesis, the time-frequency distributions of speech with different tones are studied, the time-frequency ridge is extracted as the tone feature, and the instantaneous energy of a Hilbert-Huang transform (HHT) based on linear prediction (LP) is used to extract pitch. The specific work is as follows:

(1) Because the speech signal is non-stationary and time-varying, different time-frequency distributions are studied for speech with different tones. The spectrogram and the Cohen-class time-frequency distributions are compared in terms of time-frequency resolution, time-frequency concentration, cross-term suppression, and computational speed. The smoothed pseudo Wigner-Ville distribution (SPWD) is proposed for this task because it effectively suppresses cross-term interference and has better time-frequency concentration. In the SPWD time-frequency diagrams, syllables with the same tone show the same pattern of change in the time-frequency ridge, while syllables with different tones show different patterns.

(2) The time-frequency matrix is large, and the tone information it carries is mainly embodied in the changes of the time-frequency ridge, so Chinese tone recognition based on time-frequency ridge features is studied. To obtain a fine, clear ridge, the time-frequency concentration and computation time of the SPWD are compared with those of the RSPWD. The SPWD, combined with thresholding and image thinning, is chosen to refine the ridge because its computation time is shorter. The Hough transform is then used to extract the ridge. However, the ridges of some tones are curves, so polynomial fitting based on the least squares method is adopted to detect the ridge curve. The ridge values and their first differences are taken as the features of the time-frequency ridge, and a Gaussian mixture model (GMM) is used to recognize and classify the tones. This method identifies the tones effectively and achieves good recognition results at different signal-to-noise ratios (SNRs).

(3) Because the instantaneous energy of the HHT is affected by the formant frequencies, pitch detection based on it makes errors. An LP-based Hilbert-Huang transform is therefore proposed for Chinese tone recognition. Linear prediction analysis is applied to the speech signal to compute the linear prediction residual. This eliminates the influence of the formants, while the residual retains the complete periodic excitation information, so the instantaneous energy of the residual signal is quasi-periodic. The autocorrelation of the linear prediction residual is then used to extract a reference pitch, and empirical mode decomposition (EMD) is used to smooth the energy envelope. The local maxima of the instantaneous energy are searched according to the reference pitch, and the time interval between two neighboring local maxima is taken as one pitch period. This method extracts the pitch effectively. The pitch and its first difference are fed to a GMM to recognize and classify the tones, and this method identifies the tones effectively.
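As an illustration of the ridge features in item (2), the following is a minimal sketch, not the thesis implementation. It assumes the ridge has already been extracted as one frequency value per frame, fits a least-squares polynomial to it, and returns the fitted values together with their first differences. The function names (`polyfit_ls`, `ridge_features`), the polynomial degree, the number of resampling points, and the synthetic dipping contour are all assumptions made for this example.

```python
def polyfit_ls(t, y, degree):
    """Least-squares polynomial fit; returns c[0..degree], lowest power first."""
    m = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde matrix A[i][j] = t[i]**j.
    ata = [[sum(ti ** (i + j) for ti in t) for j in range(m)] for i in range(m)]
    aty = [sum((ti ** i) * yi for ti, yi in zip(t, y)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        aty[col], aty[piv] = aty[piv], aty[col]
        for row in range(col + 1, m):
            f = ata[row][col] / ata[col][col]
            for j in range(col, m):
                ata[row][j] -= f * ata[col][j]
            aty[row] -= f * aty[col]
    # Back substitution.
    c = [0.0] * m
    for i in reversed(range(m)):
        s = aty[i] - sum(ata[i][j] * c[j] for j in range(i + 1, m))
        c[i] = s / ata[i][i]
    return c


def ridge_features(ridge_hz, degree=2, n_points=11):
    """Fit the ridge, resample it on a fixed grid, and take first differences.

    ridge_hz: ridge frequency per frame (assumed already extracted).
    Returns (values, deltas); both the degree and n_points are illustrative.
    """
    n = len(ridge_hz)
    t = [i / (n - 1) for i in range(n)]          # normalized time in [0, 1]
    c = polyfit_ls(t, ridge_hz, degree)
    grid = [i / (n_points - 1) for i in range(n_points)]
    values = [sum(ck * g ** k for k, ck in enumerate(c)) for g in grid]
    deltas = [values[i + 1] - values[i] for i in range(n_points - 1)]
    return values, deltas


# Hypothetical dipping ridge (Hz) over 50 frames, for illustration only.
ridge = [200.0 - 160.0 * u + 160.0 * u * u for u in (i / 49 for i in range(50))]
values, deltas = ridge_features(ridge)
print([round(v, 1) for v in values])
print([round(d, 1) for d in deltas])
```

Resampling on a fixed grid gives every syllable a feature vector of the same length regardless of its duration, which is one convenient way to prepare input for a classifier such as the GMM mentioned above.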
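The reference-pitch step in item (3) can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's code: it computes LP coefficients with the autocorrelation method (Levinson-Durbin), inverse-filters the frame to obtain the residual, and picks the autocorrelation peak of the residual within an assumed pitch range. The EMD smoothing and the local-maxima search on the instantaneous energy are not included. The function names, the LP order, the 70 to 400 Hz search range, and the synthetic two-formant test signal are all assumptions.

```python
import math


def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def lpc(x, order):
    """Autocorrelation-method LP coefficients via Levinson-Durbin.

    Returns a[0..order] with a[0] = 1, so that e[n] = sum_k a[k] * x[n - k].
    """
    r = autocorr(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lp_residual(x, a):
    """Inverse-filter x with A(z) to obtain the prediction residual."""
    p = len(a) - 1
    return [sum(a[k] * x[n - k] for k in range(min(p, n) + 1)) for n in range(len(x))]


def pitch_from_residual(x, fs, order=10, f_min=70.0, f_max=400.0):
    """Reference pitch (Hz) from the autocorrelation peak of the LP residual."""
    e = lp_residual(x, lpc(x, order))
    lag_min, lag_max = int(fs / f_max), int(fs / f_min)
    r = autocorr(e, lag_max)
    best_lag = max(range(lag_min, lag_max + 1), key=lambda k: r[k])
    return fs / best_lag


def synth_voiced(fs=8000, period=64, n=1024, formants=((700.0, 0.9), (1200.0, 0.9))):
    """Hypothetical test signal: impulse train through two all-pole resonators."""
    x = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for freq, radius in formants:
        b1 = 2.0 * radius * math.cos(2.0 * math.pi * freq / fs)
        b2 = -radius * radius
        y = []
        for i in range(n):
            y1 = y[i - 1] if i >= 1 else 0.0
            y2 = y[i - 2] if i >= 2 else 0.0
            y.append(x[i] + b1 * y1 + b2 * y2)
        x = y
    return x


frame = synth_voiced()
print(pitch_from_residual(frame, fs=8000))
```

The synthetic frame has an excitation period of 64 samples at 8 kHz. Because the inverse filter removes most of the resonances, the residual is close to the original impulse train, which is why its autocorrelation peak is a cleaner pitch cue than that of the raw waveform.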
Keywords/Search Tags: tone recognition, smoothed pseudo Wigner-Ville distribution, time-frequency ridge, Hilbert-Huang transform, linear prediction