Time-Frequency Speech Presence Probability And Noise Power Spectrum Estimation In Noisy Environments

Posted on: 2017-04-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C D Xu
Full Text: PDF
GTID: 1108330503955256
Subject: Information and Communication Engineering

Abstract/Summary:
Speech presence probability (SPP) estimation and noise power estimation are the two critical problems in speech enhancement, and they largely determine noise-reduction performance. The two problems are highly correlated, since both are generally derived from a statistical model of the signal spectral power. This thesis focuses on a statistical model that can give optimal solutions to both problems.

Conventional modeling methods are somewhat heuristic. The model parameter set is estimated and updated empirically, and a few important parameters are set by experience. As a result, the parameter set cannot adapt well to the observed data, and the estimated parameters cannot be guaranteed to be statistically optimal. In addition, conventional methods usually assume that an utterance begins with a non-speech segment. That segment serves as a set of labeled samples for establishing the non-speech model in a supervised way, and the statistical model is then updated frame by frame using the decision-directed method. This is in effect a semi-supervised modeling method. In reality, however, many utterances do not begin with non-speech, so the assumption frequently fails and the semi-supervised method is undesirable for real applications.

Based on these considerations, this thesis presents an unsupervised, clustering-based optimal estimation method. The model parameter set is estimated under the maximum-likelihood criterion, so the resulting speech presence probability and noise power estimates are guaranteed to be statistically optimal. The Gaussian mixture model (GMM) and the hidden Markov model (HMM) are used as templates for model-based clustering, with the speech and non-speech clusters regarded as the two components of the model. In this thesis, the clustering process is equivalent to estimating the model parameters. The noise power spectrum estimate is given by the cluster mean, and the SPP is given by the statistical characteristics of the clusters. Because the model-based clustering is unsupervised, the proposed methods do not require the assumption of a non-speech beginning, and are therefore more practical than conventional methods in real applications.

The main contributions and originality of this thesis are summarized as follows:
1. A binary-state GMM is presented to model the subband log-power sequence off-line. The parameter set is estimated using the standard EM algorithm.
2. A binary-state HMM is presented to model the subband log-power sequence off-line. The HMM outperforms the GMM in modeling temporal correlation, because the log-power sequence is treated as a dynamic process that transitions between speech and non-speech states. The EM algorithm lets the temporal correlation adapt to the observed data.
3. Based on the standard EM algorithm, a sub-optimal sequential scheme is presented that updates the GMM parameter set and outputs detection and estimation results frame by frame.
4. An on-line likelihood function is presented for the sequential HMM. Based on this likelihood, Newton-Raphson iteration is used to update the HMM parameters frame by frame, so that the sequential scheme is guaranteed to be optimal.
5. Based on the statistical characteristics of the log-power sequence, constraints are presented for the HMM/GMM that keep the models reliable during long-term speech presence.
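As an illustration of the first contribution, the following is a minimal sketch of fitting a two-component GMM to the log-power sequence of a single subband with EM. It is not the thesis's implementation: the function names, the quantile-based initialization, the variance floor, and the rule that the lower-mean component is the non-speech cluster are all assumptions made for this example, and the input is synthetic data rather than a real spectrogram.

```python
import numpy as np


def _responsibilities(x, w, mu, var):
    """E-step: posterior probability of each component for every frame."""
    log_pdf = -0.5 * (np.log(2.0 * np.pi * var) + (x[:, None] - mu) ** 2 / var)
    log_joint = np.log(w) + log_pdf
    log_norm = np.logaddexp(log_joint[:, 0], log_joint[:, 1])
    return np.exp(log_joint - log_norm[:, None])


def fit_two_state_gmm(log_power, n_iter=100, var_floor=1e-3):
    """Fit a 2-component 1-D GMM by EM (hypothetical helper, batch/off-line)."""
    x = np.asarray(log_power, dtype=float)
    # Assumed initialization: lower and upper quartiles as the two means.
    mu = np.percentile(x, [25.0, 75.0])
    var = np.full(2, max(x.var(), var_floor))
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        resp = _responsibilities(x, w, mu, var)
        # M-step: maximum-likelihood updates of weights, means, variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = np.maximum(
            (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk, var_floor
        )
    resp = _responsibilities(x, w, mu, var)
    # Assumption: the component with the lower mean is the non-speech cluster.
    noise = int(np.argmin(mu))
    return {
        "noise_log_power": float(mu[noise]),  # cluster mean as noise estimate
        "spp": resp[:, 1 - noise],            # posterior of the speech cluster
        "weights": w,
        "means": mu,
        "variances": var,
    }


# Synthetic subband: the sequence deliberately starts with "speech" frames,
# so no non-speech beginning is available to the estimator.
rng = np.random.default_rng(0)
speech_frames = rng.normal(3.0, 1.0, size=200)
noise_frames = rng.normal(-2.0, 0.5, size=300)
x = np.concatenate([speech_frames, noise_frames])

model = fit_two_state_gmm(x)
print("noise log-power estimate:", model["noise_log_power"])
print("mean SPP over speech frames:", model["spp"][:200].mean())
print("mean SPP over noise frames:", model["spp"][200:].mean())
```

Because each frame is treated independently, this sketch ignores the temporal correlation that the HMM in the second contribution is meant to capture, and it processes the whole sequence at once rather than frame by frame.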
Keywords/Search Tags: speech presence probability, noise power estimation, Gaussian mixture model, hidden Markov model, unsupervised learning, maximum likelihood estimation