Tone Recognition Of Continuous Mandarin In Noise Environment

Posted on: 2015-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: C G Liu
GTID: 2268330431450058
Subject: Signal and Information Processing
Abstract/Summary:
Chinese differs from English in many ways, most significantly in that Chinese is a tonal language and English is not. Tone is a very important feature of Chinese and is relevant to many research areas, such as speech recognition, speech synthesis, and speech coding. This thesis studies standard Mandarin. Tone patterns of isolated syllables are relatively stable, so tone recognition of isolated syllables is relatively easy. Tone recognition of continuous speech is harder, mainly because co-articulation produces many variants of each tone. Traditionally, the variants of each tone are modeled for pattern recognition, but the variant models of different tones often overlap. This overlap is the main reason why many continuous tone recognition methods struggle to improve their accuracy. In addition, speech signals are inevitably corrupted by noise, which tends to affect tone detection, so speech enhancement is needed as a preprocessing step for tone recognition. The main work and contributions of this thesis are as follows:

1. A new subspace-based speech enhancement method is proposed

A subspace-based algorithm involves two steps: estimating the signal dimension and filtering in the mixed subspace. Traditionally, subspace-based algorithms rely on a noise estimate to estimate the signal dimension. This is unreasonable because most noise in real environments is non-stationary. Here, a reconstruction error function is used to estimate the signal dimension instead. The reconstruction error is computed with principal component analysis (PCA); the minimum of the error function corresponds to the optimal reconstruction of the signal, and the dimension at which it occurs is taken as the signal dimension. Because the noise is non-stationary, a tracking algorithm is used to estimate it adaptively.
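The idea of choosing the signal dimension at the minimum of a PCA reconstruction error can be sketched as follows. The abstract does not define the error function, so this sketch assumes one simple form: the estimated error against the clean signal under white noise with a known variance `noise_var`. That assumption is made only to keep the example short; the thesis itself avoids relying on a stationary noise estimate for this step. The function name and parameters are hypothetical.

```python
import numpy as np

def estimate_signal_dimension(frames, noise_var):
    """Pick the PCA dimension k that minimizes an estimated reconstruction error.

    frames:    (n_frames, frame_len) array of noisy speech frames.
    noise_var: white-noise variance (an assumption of this sketch, not the thesis).
    Returns (k, errors), where errors[k] is the estimated error when keeping
    the k leading principal components, for k = 0..frame_len.
    """
    # Eigenvalues of the sample covariance, sorted in descending order.
    centered = frames - frames.mean(axis=0)
    cov = centered.T @ centered / len(frames)
    eigvals = np.linalg.eigvalsh(cov)[::-1]

    # Estimated clean-signal energy carried by each principal component.
    clean = np.maximum(eigvals - noise_var, 0.0)

    # discarded[k] = clean energy lost by dropping components k, k+1, ...
    discarded = np.concatenate([np.cumsum(clean[::-1])[::-1], [0.0]])

    # Keeping k components also keeps k * noise_var of noise energy.
    errors = discarded + noise_var * np.arange(len(eigvals) + 1)
    return int(np.argmin(errors)), errors
```

Under these assumptions a component is worth keeping only when its estimated clean energy exceeds the noise it brings along, which is what places the minimum of the error curve at an intermediate dimension rather than at the full frame length.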
In theory, noise can be removed by subtraction in the mixed signal subspace, an approach similar to spectral subtraction speech enhancement. In practice, this does not work well, so a Wiener filter is used instead of subtraction to remove noise in the mixed subspace. Experimental results show that the proposed subspace-based algorithm enhances speech effectively.

2. A sparse-based speech enhancement method is proposed

Speech is approximately sparse, and most of its energy lies at low frequencies. Based on these characteristics, a compound sparse dictionary is designed for speech enhancement: a sparse dictionary describes the low-frequency part of speech, and a fixed dictionary describes the high-frequency part. The fixed dictionary is necessary because the high-frequency components of speech cannot be ignored. Experiments show that this method is very effective, but not in all cases: when the signal-to-noise ratio (SNR) is low or high, its performance degrades rapidly. We believe this is mainly caused by the obvious difference between speech and noise. To improve performance, noise is also treated as a sparse signal, and both noise and speech are described by the compound dictionary. Experimental results show that this modification effectively improves the performance of the above speech enhancement algorithm at low and high SNR.

3. Tone recognition of continuous Mandarin with context information

Traditional continuous tone recognition algorithms do not take the overlap between tone templates into account. In the proposed method, the tones of continuous Mandarin are divided into four tone models, and a fuzzy-based algorithm performs tone pre-recognition.
Taking the mutual influence of adjacent tones into account, tones are predicted from known continuous tone sequences to form a tone dictionary. The final continuous tone recognition result is obtained by combining the tone pre-recognition result with the tone dictionary. To examine the influence of the pre-recognition algorithm, support vector machines and a time-warping template matching algorithm are also used to classify tones. Experimental results show that the proposed tone recognition algorithm outperforms traditional algorithms, and that this conclusion does not depend on the choice of pre-recognition algorithm. Compared with unsupervised recognition algorithms, supervised algorithms are more stable and effective.
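The combination of per-syllable pre-recognition with context from known tone sequences can be illustrated with a minimal sketch. The abstract does not specify how the tone dictionary is built or how the two sources are combined, so this example substitutes a simple stand-in: a bigram table of adjacent tones plays the role of the tone dictionary, and Viterbi decoding merges it with per-syllable membership scores of the kind a fuzzy classifier might produce. All function names and the smoothing parameter are hypothetical.

```python
import numpy as np

N_TONES = 4  # the four Mandarin tone models; tones are labeled 1..4

def build_tone_dictionary(sequences, smoothing=1.0):
    """Estimate P(next tone | current tone) from known tone sequences."""
    counts = np.full((N_TONES, N_TONES), smoothing)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev - 1, nxt - 1] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def decode_tones(memberships, transitions):
    """Viterbi decoding over per-syllable scores and adjacent-tone context.

    memberships: (n_syllables, 4) pre-recognition scores, one row per syllable.
    transitions: (4, 4) table from build_tone_dictionary.
    """
    scores = np.log(np.asarray(memberships, dtype=float) + 1e-12)
    log_trans = np.log(transitions)
    n = len(scores)
    best = scores[0].copy()
    back = np.zeros((n, N_TONES), dtype=int)
    for t in range(1, n):
        # cand[i, j]: best score of a path ending in tone i, followed by tone j
        cand = best[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        best = cand.max(axis=0) + scores[t]
    # Trace the best path backwards from the last syllable.
    path = [int(best.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [p + 1 for p in reversed(path)]
```

In this sketch, a syllable whose pre-recognition scores are ambiguous between two tones is resolved by which tone more often follows its neighbor in the known sequences, which is one way overlapping tone models could be disambiguated by context.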
Keywords/Search Tags: continuous speech, tone recognition, speech enhancement, sparse, subspace