
On Speech Enhancement Based On Microphone Arrays

Posted on: 2008-04-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J R Lin
Full Text: PDF
GTID: 1118360215950555
Subject: Signal and Information Processing
Abstract/Summary:
Compared with signals received by single, isolated microphones, the signals received by a microphone array (MA) can be processed not only in the time and frequency domains but also in the spatial domain. Consequently, an MA offers strong interference suppression as well as speech source localization, tracking, and separation. For this reason, it has been proposed as a promising way to achieve high-quality speech communication in applications such as teleconferencing, hands-free mobile telephony, and hearing aids. MAs will definitely replace conventional desktop and head-mounted microphones in the near future.

The topic of this dissertation is how to enhance the desired speech sources via adaptive beamforming (ABF) techniques in MA systems. Since the performance of ABF is very sensitive to steering vector mismatches (e.g., look direction errors, imperfect array calibration, source local scattering, and wavefront distortions), the problem of how to improve the algorithm's robustness is discussed in detail.

In Section I, the background of the field is presented, including its theoretical framework, the current state of research, and the open problems. The organization and main contributions of this dissertation are also presented.

In Section II, the General Signal Model of Microphone Arrays (GSMMA) is proposed, based on the spherical wavefront propagation equation. Unlike the conventional model, GSMMA no longer takes the narrowband and far-field assumptions for granted. Since there is no carrier in MA systems, speech should be treated as a very wideband signal. Moreover, since the speech sources may be very close to the array, i.e., within its near field, the differences in amplitude attenuation between adjacent channels should be taken into consideration.
Obviously, the conventional model is a special case of GSMMA.

In Section III, a Doubly Weighted Broadband MUSIC algorithm (DWB-MUSIC) is presented, aiming to minimize the variance of the speech source localization errors. The traditional one-dimensional, narrowband MUSIC is first extended to a multi-dimensional, broadband version, making it suitable for MA systems. Then two sequential weighting operations are applied to improve the performance. (1) A weight matrix is imposed on the noise subspace when implementing MUSIC at each frequency bin. (2) An SNR-based weight vector is imposed on the original estimates at each bin in order to calculate the final broadband source location. The former decreases the standard deviation of the narrowband estimates at each bin, which is usually caused by array perturbations. The latter weakens the negative effect of the estimates at low-SNR bins on the final broadband estimate. Together, the two weighting operations dramatically improve the robustness of broadband speech source localization. A more precise localization in turn reduces the steering vector mismatch.

In Section IV, robustness against source localization errors is ensured in a blind manner, and a Broadband Deterministic Blind Beamforming (B-DBBF) approach is presented. It is based on the conventional narrowband DBBF algorithm via rotational invariance techniques. Exploiting the nonstationarity of broadband speech sources, which plays a key role in this extension, the narrowband DBBF is extended to broadband scenarios and implemented in the frequency domain. A special correlation-based channel rearranging (CR) operation is performed to cope with the channel swap problem. Moreover, the scale ambiguity is eliminated by normalizing the norm of the weight matrices.
As a result, the desired sources can be recovered without any scale distortion.

In Section V, methods that are robust against general steering vector mismatches are investigated. For simplicity, the emphasis is on robust adaptive beamforming (RABF) based on worst-case performance optimization (Worst-case RABF, W-RABF). It belongs to the class of diagonal loading approaches, with the loading level determined by worst-case performance optimization. A closed-form solution for the optimal loading is derived after some approximations. Besides reducing the computational complexity, this solution shows how different factors affect the optimal loading. Based on it, a performance analysis of the beamformer is carried out. As a result, approximate closed-form expressions for the source-of-interest (SOI) power estimate and the output signal-to-interference-plus-noise ratio (SINR) are presented to predict its performance.

In Section VI, a novel RABF approach based on joint worst-case performance optimization (Joint Worst-case RABF, JW-RABF) is presented, aiming at robustness against both finite-sample effects and steering vector mismatches. JW-RABF is distinguished from W-RABF by taking the finite-sample effect into account and applying the worst-case performance optimization not only to the constraints but also to the objective of the constrained quadratic optimization problem. Using approximations similar to those in Section V, simple closed-form solutions for the optimal loading and the optimal weight vector are also presented. Compared with W-RABF, it achieves better robustness when the sample size is small. Moreover, combined with Frequency Focusing (FF) techniques, a broadband JW-RABF approach is presented.
Besides ensuring the frequency invariance of the broadband beampattern, it can effectively suppress both correlated and uncorrelated interferences.

In Section VII, the work of this dissertation is summarized and a few important conclusions are drawn. Finally, some possible directions for future research are discussed.
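
As an illustration of two ideas named in the abstract, the near-field (spherical wavefront) signal model of Section II and the diagonal loading used in Sections V and VI, a minimal Python sketch follows. It is not the dissertation's algorithm: the loading level is a fixed, arbitrary value rather than the worst-case optimal loading, and the array geometry, source position, frequency, speed of sound, and all function names are assumptions chosen only for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def near_field_steering_vector(mic_positions, source_position, freq_hz):
    """Spherical-wavefront steering vector at a single frequency bin.

    Each entry combines a 1/r amplitude decay with a propagation delay,
    both taken relative to the first microphone (the reference channel).
    """
    distances = np.linalg.norm(mic_positions - source_position, axis=1)
    delays = (distances - distances[0]) / SPEED_OF_SOUND
    gains = distances[0] / distances
    return gains * np.exp(-2j * np.pi * freq_hz * delays)


def loaded_mvdr_weights(covariance, steering, loading):
    """Diagonally loaded MVDR weights, rescaled so that w^H a = 1."""
    loaded = covariance + loading * np.eye(covariance.shape[0])
    r_inv_a = np.linalg.solve(loaded, steering)
    return r_inv_a / (steering.conj() @ r_inv_a)


# Hypothetical setup: 4 microphones on a line, 5 cm apart, source 0.5 m away.
rng = np.random.default_rng(0)
mics = np.column_stack([np.arange(4) * 0.05, np.zeros(4)])
source = np.array([0.1, 0.5])
a = near_field_steering_vector(mics, source, freq_hz=1000.0)

# Sample covariance from a small number of noisy snapshots at this bin.
n_snapshots = 20
signal = rng.standard_normal(n_snapshots) + 1j * rng.standard_normal(n_snapshots)
noise = 0.1 * (rng.standard_normal((4, n_snapshots))
               + 1j * rng.standard_normal((4, n_snapshots)))
x = np.outer(a, signal) + noise
R = x @ x.conj().T / n_snapshots

# Fixed loading level, chosen arbitrarily for illustration.
w = loaded_mvdr_weights(R, a, loading=0.1)
print("gain toward the assumed source:", abs(np.vdot(w, a)))
```

The rescaling in `loaded_mvdr_weights` keeps unit gain toward the assumed steering vector. As the loading grows, the weights approach a / (a^H a), the data-independent matched beamformer, which is why the choice of loading level trades adaptivity against robustness.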
Keywords/Search Tags: Microphone array, Speech enhancement, Robust algorithm, MUSIC algorithm, Rotational invariance, Adaptive beamforming, Diagonal loading, Steering vector mismatches