
Adaptive Acoustic And Speech Signal Processing Towards Real-Time Operation

Posted on: 2006-08-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Y He
GTID: 1118360182972721
Subject: Communication and Information System
Abstract/Summary:
The work in this doctoral thesis is mainly concerned with acoustic echo cancellation (AEC) and blind signal separation (BSS) of speech towards real-time operation, both of which have been active research topics in recent years. Both AEC and BSS face a major challenge due to the very long room impulse response (RIR) of an acoustic propagation path and the non-stationarity and auto-correlation of speech signals. In a typical office, an RIR lasts hundreds of milliseconds, so the corresponding FIR filter needs at least 1000-2000 taps at an 8 kHz sampling frequency. An adaptive filter in either AEC or BSS therefore has to estimate thousands of coefficients to describe an RIR. This is even more challenging in the real-time implementations that audio signal processing tasks such as AEC and BSS often require, and it becomes harder still when the acoustic environment changes frequently. The performance of an algorithm used in AEC or BSS usually trades off against its computational complexity. Aiming at feasible real-time implementations, this thesis focuses on finding that trade-off and on building new models for AEC and BSS together with corresponding computationally efficient algorithms. The four main contributions of this research are described below. Acoustic echoes arise from the coupling between a loudspeaker and a microphone in the same room. AEC is divided into single-channel AEC (SCAEC) and multi-channel AEC (MCAEC), which eliminate the contribution of one loudspeaker or of multiple loudspeakers, respectively, to the microphone signals. Adaptive filters in AEC are used to identify the acoustic echo paths, that is, to estimate their RIRs.
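To make the scale of the identification problem concrete, the following is a minimal sketch, not the thesis's own implementation. It converts an RIR duration into a tap count and then identifies a short synthetic echo path with a textbook normalized LMS (NLMS) filter. The function names `fir_taps` and `nlms_identify`, the 125 ms and 250 ms durations (back-computed from the 1000-2000 tap figure at 8 kHz), the 32-tap synthetic path, and the white-noise excitation are all assumptions made for illustration.

```python
import numpy as np

def fir_taps(rir_ms: float, fs_hz: float) -> int:
    """Number of FIR taps needed to span an impulse response of rir_ms milliseconds."""
    return int(round(rir_ms * fs_hz / 1000.0))

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR echo path from far-end signal x and microphone signal d (NLMS)."""
    w = np.zeros(num_taps)      # current estimate of the echo path
    buf = np.zeros(num_taps)    # most recent far-end samples, newest first
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e = d[n] - w @ buf                      # a priori echo-cancellation error
        w += mu * e * buf / (buf @ buf + eps)   # normalized gradient step
    return w

# Tap counts at 8 kHz for two assumed RIR durations.
print(fir_taps(125, 8000), fir_taps(250, 8000))  # prints: 1000 2000

# Hypothetical short echo path (32 taps, exponentially decaying), white far-end signal.
rng = np.random.default_rng(0)
true_path = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))
x = rng.standard_normal(4000)
d = np.convolve(x, true_path)[: len(x)]          # noise-free echo at the microphone

w = nlms_identify(x, d, num_taps=32)
misalignment_db = 10 * np.log10(np.sum((w - true_path) ** 2) / np.sum(true_path ** 2))
print(f"misalignment: {misalignment_db:.1f} dB")
```

This toy case converges quickly because the excitation is white and the path is short. With speech as the far-end signal and a path of 1000 or more taps, convergence is slower and each update costs proportionally more, which is the difficulty the abstract describes.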
Because these RIRs are long and speech is auto-correlated, both SCAEC and MCAEC encounter three main difficulties in practice: 1) the convergence rate of an adaptive filter tends to decrease due to the auto-correlation of the speech signal; 2) the performance of AEC degrades as the filter length increases; 3) real-time operation becomes much more difficult because the computational load may increase rapidly with the filter length. For SCAEC, facing these three problems, we first analyze the characteristics of some typical adaptive algorithms used for SCAEC and compare their performance by simulation. Second, for real-time operation, we developed two real-time DSP systems, an acoustic echo canceller and an acoustic RIR measurement system, to verify the effectiveness of some of the algorithms. Experimental results show that both the decorrelated LMS algorithm and the fast block frequency LMS algorithm are feasible for real-time applications. MCAEC not only encounters the same difficulties as SCAEC but also faces a new challenge: the non-uniqueness problem (NUP) of the solution of the adaptive identification filter. This means that the adaptive filters in MCAEC must track acoustic changes not only in the near-end room but also in the far-end room. The adaptive algorithms used for MCAEC must therefore perform better than those used for SCAEC. For MCAEC, we first analyze the crux of the NUP and then broadly explore decorrelation-based methods to relieve MCAEC algorithms of the NUP. The proposed decorrelation-based methods fall into three types: the amplitude perturbation method (APM), the direct soft limiting on amplitude method (DSLAM), and the time-axis perturbation method (TPM). Experimental results show that DSLAM is slightly better than APM, and that TPM is much better than both APM and DSLAM.
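The non-uniqueness problem can be illustrated numerically. In the sketch below, two far-end channels are filtered versions of one source, so the covariance matrix of the stacked two-channel input is (numerically) singular and the adaptive filter has no unique solution. The abstract does not define APM, DSLAM, or TPM, so the decorrelation step shown here is a different, well-known technique from the stereo echo cancellation literature (adding a scaled half-wave rectified copy of each channel), used purely as a stand-in. The filter lengths, the value of `alpha`, and all function names are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def stacked_covariance(x1, x2, L):
    """Sample covariance of the stacked regressor [x1(n..n-L+1), x2(n..n-L+1)]."""
    X = np.hstack([sliding_window_view(x1, L)[:, ::-1],
                   sliding_window_view(x2, L)[:, ::-1]])
    return X.T @ X / X.shape[0]

def half_wave(x, alpha, sign):
    """Add a scaled half-wave rectified copy of x (positive half for sign=+1, negative for -1)."""
    return x + alpha * 0.5 * (x + sign * np.abs(x))

rng = np.random.default_rng(1)
L = 8                                    # adaptive filter length per channel (assumed)
s = rng.standard_normal(5000)            # single far-end source
g1 = rng.standard_normal(L)              # hypothetical far-end room paths
g2 = rng.standard_normal(L)
x1 = np.convolve(s, g1)[: len(s)]        # the two loudspeaker signals are
x2 = np.convolve(s, g2)[: len(s)]        # linearly related through s

# Without decorrelation: x1 * g2 == x2 * g1, so the covariance is rank deficient.
cond_plain = np.linalg.cond(stacked_covariance(x1, x2, L))

# With a nonlinear perturbation (stand-in, not the thesis's APM/DSLAM/TPM).
alpha = 0.5
cond_pert = np.linalg.cond(
    stacked_covariance(half_wave(x1, alpha, +1), half_wave(x2, alpha, -1), L)
)

print(f"condition number without decorrelation: {cond_plain:.3e}")
print(f"condition number with decorrelation:    {cond_pert:.3e}")
```

The perturbation breaks the exact linear relation between the channels, so the second condition number is finite and much smaller. The price is distortion of the far-end signals, which is why any such perturbation has to be kept small.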
It should be noted that the controlling parameters of TPM should be kept very small so as to preserve the spatial information carried by the multi-channel speech signals from the far end. BSS deals with the problem of separating independent sources from their mixtures alone, when both the mixing process and the original sources are unknown. In acoustic signal processing, BSS can be used to extract individual speech signals from multiple microphone signals when several speakers are talking simultaneously. Blind separation is expected to find increasing application in teleconferencing. However, conventional separation algorithms can hardly be implemented in real time because of their high computational complexity. The computational load is mainly caused by the direct or indirect estimation of thousands of room acoustic parameters. Developing an effective method with low computational complexity is therefore increasingly important. On this basis, the research first focuses on how to simplify the usual convolutive mixing process of audio sources by exploiting the similarity of room acoustic propagation paths, so as to reduce the computational complexity of BSS applications. Second, a simplified mixing model (SMM) based on this similarity is proposed. Its main advantage is that it uses only the difference between two transfer functions for blind signal separation, rather than the two transfer functions themselves. As a result, the BSS process is simplified without performance loss. Finally, an adaptive BSS algorithm for SMM based on second-order statistics (SOS) is proposed. The effectiveness of SMM and of the proposed algorithm is confirmed by a real-time DSP system and by simulation. For MCAEC, we devise a novel method to deal with the NUP: based on BSS, a multi-channel acoustic echo suppression model (MCAESM) is proposed.
MCAESM avoids the inherent non-uniqueness problem that arises in multi-channel acoustic echo cancellation from strongly cross-correlated acoustic echo-source signals. On the contrary, it makes full use of this cross-correlation to remove acoustic echoes from the convolutive mixtures picked up by the microphones. In this model, only one additional microphone is needed to help separate and suppress the multi-channel acoustic echoes in each microphone signal. For real-time processing, a new BSS algorithm based on second-order statistics, with lower computational complexity and greater robustness, is proposed to verify the effectiveness of MCAESM. In particular, the multi-channel acoustic echo suppression performance can be improved considerably, at no extra cost, by carefully placing the additional microphone close to the loudspeakers. Experimental results confirm the effectiveness of the proposed model and algorithm.
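The convolutive mixing that the BSS and echo suppression parts refer to can be sketched as follows. This is only the generic mixing model, in which each microphone receives every source filtered by its own room path. It is not the simplified mixing model (SMM) or MCAESM, whose details the abstract does not give. The function name `convolutive_mix`, the 2x2 configuration, and the 1000-tap path length are assumptions chosen to match the tap counts quoted in the abstract.

```python
import numpy as np

def convolutive_mix(sources, H):
    """Mix sources through FIR room paths.

    sources: array of shape (num_src, N)
    H:       array of shape (num_mic, num_src, L), H[i, j] is the path from source j to mic i
    returns: array of shape (num_mic, N)
    """
    num_mic, num_src, _ = H.shape
    N = sources.shape[1]
    mics = np.zeros((num_mic, N))
    for i in range(num_mic):
        for j in range(num_src):
            mics[i] += np.convolve(sources[j], H[i, j])[:N]
    return mics

rng = np.random.default_rng(2)
num_src, num_mic, L, N = 2, 2, 1000, 8000        # 1 s of signal at 8 kHz (assumed)
sources = rng.standard_normal((num_src, N))
decay = np.exp(-np.arange(L) / 200.0)            # hypothetical decaying room paths
H = rng.standard_normal((num_mic, num_src, L)) * decay

mics = convolutive_mix(sources, H)
print("microphone signals:", mics.shape)
print("unknown mixing coefficients:", H.size)    # num_mic * num_src * L
```

Even this small two-source, two-microphone case has 4000 unknown mixing coefficients, which is the kind of parameter count that the abstract identifies as the main source of computational load in conventional separation algorithms.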
Keywords/Search Tags:Acoustic Echo Cancellation, Blind Signal Separation, Convolutive Mixing, Acoustic and Speech, Adaptive Algorithm, Real-Time Processing, DSP, Model