Speech enhancement algorithms using Kalman filtering and masking properties of human auditory systems

Posted on:2006-06-22

Degree:Ph.D

Type:Thesis

University:University of Ottawa (Canada)

Candidate:Ma, Ning

GTID:2458390005496850

Subject:Engineering

Abstract/Summary:

Speech enhancement algorithms have been employed successfully in many areas such as VoIP, automatic speech recognition and speaker verification. Many approaches are presented in the literature. This thesis focuses on enhancing single-channel speech degraded by white or colored noise. A Kalman filter algorithm combined with the masking properties of the human auditory system is proposed. The threshold computed from the masking properties is used as a constraint in the Kalman filter, from which a modified Kalman filter is derived theoretically. The derivation gives a theoretical foundation for the feasibility of combining masking properties with a Kalman filter. Some heuristic methods are also proposed for easier implementation. One algorithm uses the frequency-domain masking level as a hard threshold to reshape the Kalman-filtered signal. Another uses a post-filter cascaded with the Kalman filter, with a threshold that takes both time-domain and frequency-domain masking properties into account. The goal of the masking is to keep the energy of the state estimation error below the masking threshold. To further decrease the computational cost, a wavelet Kalman filter combined with masking thresholds is also introduced. In the above algorithms, the speech model is assumed to be linear. Nonlinear speech models are also considered in the thesis. To address the nonlinear model problem, dual Extended Kalman Filter (EKF) and dual Unscented Kalman Filter (UKF) algorithms are studied. In these cases, both time-domain and frequency-domain masking properties are taken into account. The simulation results show that all the proposed methods combining a Kalman filter with masking properties produce promising results in terms of PESQ (Perceptual Evaluation of Speech Quality) scores. The average PESQ score gains obtained by the proposed methods range from about 0.35 to 0.45. Some informal subjective tests also show that the performance of the proposed methods is promising. No voice activity detection is required in the proposed methods.
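To make the linear-model case concrete, the following is a minimal sketch of a Kalman filter removing white noise from a signal. It is not the thesis's algorithm: it assumes a first-order autoregressive (AR) signal model chosen only for brevity, assumes the AR coefficient and both noise variances are known in advance, and leaves out the masking threshold entirely. The names `kalman_denoise`, `mean_squared_error`, `demo` and the parameters `a`, `q`, `r` are hypothetical.

```python
import math
import random


def kalman_denoise(noisy, a, q, r):
    """Scalar Kalman filter for the assumed model
         x[k] = a * x[k-1] + w[k],   var(w) = q   (signal model)
         y[k] = x[k] + v[k],         var(v) = r   (white measurement noise)
    Returns the filtered estimate of x for every sample of y.
    Assumes |a| < 1 so that the signal has a finite stationary variance.
    """
    if abs(a) >= 1.0:
        raise ValueError("this sketch assumes a stable AR(1) model, |a| < 1")
    x_est = 0.0                    # posterior state estimate
    p = q / (1.0 - a * a)          # start from the stationary signal variance
    enhanced = []
    for y in noisy:
        # Prediction step: propagate the estimate and its error variance.
        x_pred = a * x_est
        p_pred = a * a * p + q
        # Update step: blend prediction and observation via the Kalman gain.
        gain = p_pred / (p_pred + r)
        x_est = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        enhanced.append(x_est)
    return enhanced


def mean_squared_error(u, v):
    """Average squared difference between two equal-length sequences."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u)


def demo(n=2000, a=0.95, q=0.1, r=1.0, seed=0):
    """Synthesize an AR(1) signal, add white noise, filter it, and
    return (MSE of noisy input, MSE of filtered output) against the clean signal."""
    rng = random.Random(seed)
    clean, x = [], 0.0
    for _ in range(n):
        x = a * x + rng.gauss(0.0, math.sqrt(q))
        clean.append(x)
    noisy = [c + rng.gauss(0.0, math.sqrt(r)) for c in clean]
    enhanced = kalman_denoise(noisy, a, q, r)
    return mean_squared_error(noisy, clean), mean_squared_error(enhanced, clean)


if __name__ == "__main__":
    noisy_mse, enhanced_mse = demo()
    print("MSE before filtering:", noisy_mse)
    print("MSE after filtering: ", enhanced_mse)
```

In practice the model parameters are not known and must be estimated from the noisy signal, and real speech needs a higher-order model with a vector state. A masking-based variant would additionally need a psychoacoustic threshold, which this sketch does not attempt to compute.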

Keywords/Search Tags:

Kalman filter, Masking properties, Speech, Algorithms, Proposed methods

Related items

1	Speech Enhancement Methods
2	Speech Enhancement Based On Masking Properties Of The Human Auditory System
3	Research On Speech Enhancement Algorithms Of Microphone Array Based On Time-Frequency Masking
4	Research On Wideband Speech Enhancement Algorithms
5	Speech Masking Based On Artificial Synthesis
6	Research On Objective Measures Of Speech Quality Based On Masking Properties Of Auditory System
7	Research On Speech Masking System
8	Research And Improvement For Several Speech Enhancement Algorithms And De-noising
9	Wavelet Filter-bank And Psychoacoustic Modeling For Speech Enhancement
10	Research On Single Channel Speech Enhancement Applying Kalman Filter