
Speech enhancement for robust speech communication

Posted on: 2007-11-26
Degree: Ph.D
Type: Thesis
University: The University of Utah
Candidate: Deng, Ying
Full Text: PDF
GTID: 2448390005478739
Subject: Engineering
Abstract/Summary:
As speech communication systems come into ever wider use, making them robust to the background environment becomes important. Speech enhancement systems are front-end processing elements that provide enhanced speech for use in other applications. In this dissertation, a subband speech enhancement system is explored. Subband speech enhancement systems are composed of two parts: signal representation and signal estimation.

For signal representation, traditional methods use the short-time Fourier transform (STFT). STFT-based methods analyze the signal with a fixed time-frequency resolution once the analysis window is selected. However, speech signals are in general nonstationary, and it is believed that multiresolution analysis/synthesis methods can better represent this nonstationarity. The discrete wavelet/wavelet packet transform is a popular multiresolution representation that has recently been explored in speech enhancement systems. Though multiresolution analysis of speech signals matches some aspects of human perception, the practical realization of a perceptually tuned speech enhancement system using the wavelet/wavelet packet transform has not yet been successful. This is due to the trade-off between aliasing cancelation and low delay inherent in the transform. In this dissertation, I present a method for directly designing a multiresolution filter bank that has an independent subband aliasing cancelation property and at the same time achieves low delay.

For signal estimation, the multiresolution signal components are commonly assumed to be independent and Gaussian. Recent research results have shown that this assumption might not be appropriate. This dissertation presents a time-varying, nonlinear, non-Gaussian model of the subband sequences for signal estimation. This model results in an estimation problem that has no analytic solution. Particle filtering was developed for fullband speech enhancement systems as a numerical solution to nonlinear/non-Gaussian estimation problems that have no analytic solution. In this dissertation, the use of particle filtering is extended to the subband domain and to colored-noise cases.

The presented subband speech enhancement system outperforms its competitors and can be used in real-time applications that have low-delay requirements.
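As a rough illustration of how particle filtering handles an estimation problem with no analytic solution, the following is a minimal sketch of a generic bootstrap particle filter. It is not the dissertation's subband model: the state transition (a `tanh` nonlinearity with Gaussian process noise), the Laplace-distributed observation noise, the function name `bootstrap_particle_filter`, and all parameter values are assumptions chosen only to give a small nonlinear, non-Gaussian example.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=0.5, obs_scale=0.3, seed=0):
    """Return one posterior-mean state estimate per observation.

    Hypothetical toy model (an assumption, not taken from the dissertation):
        x_t = 0.9 * tanh(x_{t-1}) + Gaussian noise (std = process_std)
        y_t = x_t + Laplace noise (scale = obs_scale)
    """
    rng = random.Random(seed)
    # Initial particles drawn from an assumed standard-normal prior.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: push every particle through the nonlinear state model.
        particles = [0.9 * math.tanh(x) + rng.gauss(0.0, process_std)
                     for x in particles]
        # Weight: Laplace log-likelihood of y given each particle.
        log_w = [-abs(y - x) / obs_scale for x in particles]
        # Subtract the maximum before exponentiating to avoid underflow.
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Estimate: weighted mean of the particles approximates E[x_t | y_1..t].
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample: multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Calling `bootstrap_particle_filter([0.1, 0.5, -0.2])` returns three state estimates, one per observation. Multinomial resampling at every step is the simplest choice; a practical system would more likely use systematic resampling and resample only when the effective sample size drops.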
Keywords/Search Tags: Speech, Dissertation