
Research On Underdetermined Convolutive Speech Signal Separation Methods

Posted on: 2011-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: B Q Liu
Full Text: PDF
GTID: 2178330332964060
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
In speech signal processing, separating and extracting the original speech from observed speech mixtures has become both a popular and a difficult research topic. It is also an important direction in the field, with direct value for speech recognition and speech enhancement.

Blind source separation recovers unknown independent sources from several observed mixed signals, using the statistical characteristics of the signals and no prior knowledge of the sources or channels. Because it is both difficult and practical, it has become the most widely studied approach to speech separation.

Most current speech source separation methods require that the number of observed signals be no less than the number of source signals. This restriction does not suit some practical situations: in reality there are many underdetermined cases in which the number of observed signals is smaller than the number of source signals. In practice, the influence of the environment on signal transmission must also be taken into account; it introduces delays, which are represented mathematically as a convolution. The search for blind speech separation methods for the underdetermined convolutive model is therefore of great practical significance.

This dissertation proposes three blind speech separation methods for the underdetermined convolutive model:

1) A blind speech source separation method that combines fast independent component analysis (FastICA) with adaptive nonlinear binary time-frequency masking. Nonlinear binary masks are estimated from the outputs of a FastICA algorithm, which makes it possible to extract basic speech signals from a convolutive mixture iteratively. The basic signals are then improved by merging the masks. The stereo property of the extracted speech signals can be maintained. Simulation results show that the proposed method outperforms the DUET and BLUES methods, and the signal-to-noise ratio gain is greatly improved.

2) A blind audio source separation method based on nonnegative matrix factorization (NMF). Each source STFT is given a model inspired by NMF with the Itakura-Saito divergence, which underlies a statistical model of superimposed Gaussian components. An expectation-maximization (EM) algorithm is used to estimate the parameters and reconstruct the signals. The decomposition algorithms are applied to blind stereo audio source separation, and simulation results show that the proposed method is valid.

3) A blind audio source separation method based on the fast relative Newton method and the Smoothing Method of Multipliers (SMOM). This approach exploits both the sparseness and the independence of speech signals. Using the fast quasi-Newton method greatly simplifies the Hessian matrix and greatly increases the speed of computation. Incorporating a Lagrange multiplier into a smooth approximation of the max-type function yields an extended notion of the augmented Lagrangian. Convergence of the method is further accelerated without increasing the dimension of the problem. The efficiency of this approach is demonstrated on many examples of blind speech separation.
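The binary time-frequency masking step of method 1) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names `binary_masks` and `apply_mask`, the threshold `tau`, and the toy arrays are all assumptions made for illustration. It assumes two separated outputs are already available as complex STFT arrays (for example from an ICA stage) and assigns each time-frequency bin to an output only when that output clearly dominates the other. Applying one mask to both mixture channels is one way the stereo property could be kept.

```python
import numpy as np

def binary_masks(Y1, Y2, tau=1.0):
    """Estimate two binary time-frequency masks from two separated outputs.

    Y1, Y2 : complex STFT arrays of shape (freq, frames), assumed to come
             from an earlier separation stage such as ICA.
    tau    : hypothetical threshold (>= 1); a bin is given to an output only
             if its magnitude exceeds tau times the other output's magnitude.
    """
    a1, a2 = np.abs(Y1), np.abs(Y2)
    m1 = a1 > tau * a2   # bins dominated by output 1
    m2 = a2 > tau * a1   # bins dominated by output 2
    return m1, m2

def apply_mask(mask, X):
    """Apply one (freq, frames) mask to every channel of X (channels, freq, frames)."""
    return mask[None, :, :] * X

# Toy 2x2 "spectrograms" standing in for the two separated outputs.
Y1 = np.array([[3 + 0j, 1 + 0j], [0.5j, 2 + 0j]])
Y2 = np.array([[1 + 0j, 2 + 0j], [0.4j, 2 + 0j]])
m1, m2 = binary_masks(Y1, Y2, tau=1.5)

# Toy two-channel mixture STFT; the same mask is applied to both channels.
X = np.stack([Y1 + Y2, Y1 - Y2])
stereo1 = apply_mask(m1, X)
```

With `tau` greater than or equal to 1, no bin can belong to both masks, and bins where neither output dominates are left unassigned. In an iterative scheme, the masked stereo signals would be fed back into the separation stage.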
Keywords/Search Tags:blind speech separation, nonlinear binary time-frequency masking, non-negative matrix factorization, fast relative-newton method