
Research on Noisy Speech Coding

Posted on: 2008-01-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Li
GTID: 1118360212498581
Subject: Signal and Information Processing
Abstract/Summary:
Mobile communication technology is developing rapidly, and the range of speech communication has expanded. Speech communication often takes place against background noise, which corrupts the speech signal. In parametric coding, the accuracy of the speech parameters strongly affects the quality of the coded speech. Extracting the pitch and the linear prediction coefficients from noisy speech, quantizing them effectively, and reducing the noise are therefore important for both research and applications.

The pitch is a very important parameter of the excitation source in speech coding. A pitch detection algorithm based on the average magnitude difference function (AMDF) and the autocorrelation function (ACF) is proposed for real-time applications, with reduced computational expense. First, the AMDF values of a frame of the speech signal are computed, and ACF values are then computed from the AMDF values. To reduce computational expense and complexity, the AMDF values of the frame are converted into a one-bit signal, which also reduces the effect of the amplitude and formants of the speech signal on pitch detection. The pitch period is calculated by applying the ACF to the one-bit signal, so the multiplications in the short-time autocorrelation function are replaced by simple additions. A real-time pitch detector based on a field-programmable gate array (FPGA) is also proposed. The memories, gates, and sequential circuits of a Spartan-II XC2S30 chip are used to implement these algorithms, which meets the needs of real-time pitch detection.

The pitch of noisy speech cannot be estimated correctly when the SNR of the speech signal is low. A pitch detection method for noisy speech signals based on glottal closure instants (GCI) and the discrete wavelet transform is proposed.
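As an aside on the AMDF and one-bit autocorrelation detector described earlier in this abstract, the processing chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dissertation's implementation: the one-bit rule (1 where the AMDF falls below its mean), the window lengths, and the function names `amdf`, `to_one_bit`, `one_bit_acf`, and `pitch_period` are all hypothetical choices that the abstract does not specify.

```python
import math

def amdf(frame, n_lags):
    """Average magnitude difference for lags 0 .. n_lags-1 over a fixed window."""
    window = len(frame) - (n_lags - 1)
    if window < 1:
        raise ValueError("frame too short for the requested number of lags")
    return [sum(abs(frame[i] - frame[i + k]) for i in range(window)) / window
            for k in range(n_lags)]

def to_one_bit(values):
    """Assumed one-bit rule: 1 where the AMDF is below its mean (a valley), else 0."""
    mean = sum(values) / len(values)
    return [1 if v < mean else 0 for v in values]

def one_bit_acf(bits, max_lag):
    """Autocorrelation of a 0/1 sequence for lags 0 .. max_lag.

    The product of two bits is a logical AND, so each lag is just a count
    (additions only, no multiplications).
    """
    window = len(bits) - max_lag
    return [sum(bits[i] & bits[i + k] for i in range(window))
            for k in range(max_lag + 1)]

def pitch_period(frame, min_lag, max_lag):
    """Pitch period in samples: the first lag in [min_lag, max_lag] with the largest bit ACF."""
    bits = to_one_bit(amdf(frame, 2 * max_lag + 1))
    acf = one_bit_acf(bits, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: acf[k])

# Synthetic frame: a sinusoid with a period of 50 samples.
frame = [math.sin(2 * math.pi * t / 50) for t in range(400)]
print(pitch_period(frame, min_lag=20, max_lag=100))
```

Because the one-bit threshold is relative to the mean of the AMDF, scaling the frame amplitude leaves the bit pattern unchanged, which is one plausible reading of the abstract's remark that the one-bit conversion reduces the effect of amplitude.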
The GCI positions of the speech are estimated using the wavelet transform, and the pitch is then calculated from them. A third-order lowpass elliptic filter reduces the effect of the noise and of the speech formants on pitch detection. Compared with the multi-scale wavelet transform algorithm, the method increases the precision of pitch detection and reduces computational expense and complexity.

It is difficult to extract the linear prediction coefficients from a noisy speech signal. A method of extracting them based on spectral subtraction is proposed. Because the energy and frequency content of the noise change over time, the minimum statistics tracking method is used to estimate the noise power spectrum. The power spectrum of the speech signal is recovered by spectral subtraction, and the linear prediction coefficients are then extracted from it. Experimental results show that the method increases the accuracy of the extracted linear prediction coefficients.

Quantization is very important in parametric coding. The dissertation studies in depth several common methods for vector quantization of the line spectrum frequency (LSF) parameters. Vector quantization based on Gaussian mixture models (GMM) is computationally efficient and has low memory requirements, and its complexity is independent of the rate of the system. The GMM quantizer can describe much of the information about the distribution of the parameter space. The nonlinear quantization method reduces computational expense and memory requirements and increases quantization precision.

Speech enhancement is used as a pre-processing stage when the speech signal is seriously corrupted. A speech enhancement algorithm based on the spectral envelope and Kalman smoothing is proposed.
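As an aside on the spectral-subtraction route to the linear prediction coefficients described earlier in this abstract, the following Python sketch shows one way the steps could fit together: estimate the noise power spectrum, subtract it from the noisy power spectrum, convert the result to an autocorrelation sequence, and solve for the coefficients with the Levinson-Durbin recursion. It rests on several assumptions that are not in the source: `track_noise_minimum` is a crude per-bin minimum standing in for a full minimum statistics tracker, the spectral floor of 0.01 is arbitrary, and all function names are hypothetical.

```python
import cmath
import math

def power_spectrum(frame):
    """Periodogram |X[k]|^2 / N, computed with a direct DFT (fine for short frames)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n)]

def track_noise_minimum(recent_spectra):
    """Crude stand-in for minimum statistics: per-bin minimum over recent frames."""
    return [min(bins) for bins in zip(*recent_spectra)]

def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract the noise estimate, keeping a small fraction of the noisy power as a floor."""
    return [max(p - q, floor * p) for p, q in zip(noisy_power, noise_power)]

def autocorr_from_power(power, order):
    """Inverse DFT of a symmetric power spectrum gives the (circular) autocorrelation."""
    n = len(power)
    return [sum(power[k] * math.cos(2 * math.pi * k * lag / n) for k in range(n)) / n
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a1 z^-1 + ... + ap z^-p; returns (a, prediction error)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a, err = new_a, err * (1.0 - k * k)
    return a, err

def lpc_from_noisy_power(noisy_power, noise_power, order):
    """LPC coefficients from a noisy power spectrum after spectral subtraction."""
    clean_power = spectral_subtract(noisy_power, noise_power)
    return levinson_durbin(autocorr_from_power(clean_power, order), order)[0]

# Idealized check: a decaying exponential (one pole at 0.9) plus a flat noise floor.
clean = power_spectrum([0.9 ** t for t in range(64)])
noise = [0.05] * 64
noisy = [p + q for p, q in zip(clean, noise)]
print(lpc_from_noisy_power(noisy, noise, order=2))
```

In this idealized case the noise estimate is exact, so the subtraction recovers the clean spectrum; with a real tracker the estimate is only approximate, which is why the floor is needed to keep the subtracted spectrum positive.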
Because the vocal tract parameters change slowly, the linear prediction coefficients are converted into line spectrum frequency parameters, and these parameters are smoothed across the current and previous frames. This reduces the residual isolated noise. The quality of the enhanced speech is evaluated by segmental SNR and ITU PESQ scores. Experimental results indicate that the proposed algorithm achieves clear improvements over the conventional Kalman smoother and the Wiener filter algorithm.
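For readers unfamiliar with the segmental SNR measure mentioned above, a minimal Python sketch follows. It averages the per-frame SNR in dB between a clean reference and the enhanced signal. The frame length of 160 samples and the clamping range of -10 dB to 35 dB are common conventions assumed here; the abstract does not state which settings the dissertation used, and the function name `segmental_snr` is hypothetical.

```python
import math

def segmental_snr(clean, enhanced, frame_len=160, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi] (assumed convention)."""
    n = min(len(clean), len(enhanced))
    snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(v * v for v in s)
        error_energy = sum((a - b) ** 2 for a, b in zip(s, e))
        if signal_energy == 0.0:
            continue  # skip silent reference frames
        if error_energy == 0.0:
            snr = hi  # perfect reconstruction saturates at the upper clamp
        else:
            snr = 10.0 * math.log10(signal_energy / error_energy)
        snrs.append(min(max(snr, lo), hi))
    if not snrs:
        raise ValueError("no usable frames")
    return sum(snrs) / len(snrs)

# Two frames of a constant signal with a constant error of 0.1 per sample.
reference = [1.0] * 320
estimate = [1.1] * 320
print(segmental_snr(reference, estimate))
```

Clamping keeps frames with very low or very high SNR from dominating the average, which is why the measure is usually reported with its frame length and limits.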
Keywords/Search Tags: Research