
Optimal filtering and speech recognition with microphone arrays

Posted on: 2002-03-16
Degree: Ph.D
Type: Dissertation
University: Brown University
Candidate: Adcock, John Edward
Full Text: PDF
GTID: 1468390011999006
Subject: Engineering
Abstract/Summary:
Microphone arrays are becoming an increasingly popular tool for speech capture and may soon render the traditional desktop or headset microphone obsolete. Unlike conventional directional microphones, microphone arrays are electronically steerable, which gives them the potential to acquire a high-quality signal (or signals) from a desired direction (or directions) while attenuating off-axis noise or talkers. Because the steering is done in software and not by a physical realignment of sensors, the number of simultaneously active targets is limited only by the available processing power, and moving targets may be tracked within the receptive area of the microphone array. The applications for microphone array speech interfaces include telephony and teleconferencing, speech recognition and automatic dictation, and acoustic surveillance. To realize the promise of the unobtrusive, hands-free speech interface that microphone arrays offer, they must perform robustly in a wide variety of challenging environments.

This dissertation investigates the performance of speech enhancement algorithms applied to microphone arrays. A set of objective measures for evaluating the effectiveness of speech enhancement techniques is presented, including signal-to-noise ratio, Bark spectral distortion, and a feature-distortion measure that is based upon the features of a speech recognition system and intended to reflect the performance of that system. A large set of microphone array recordings, which includes a time-aligned close-talking microphone reference channel, is described. Baseline performance of a delay-and-sum beamformer on this data set is presented, including the performance of an alphadigit speech recognition system. A multi-input filtering algorithm for signal enhancement is derived from the well-known Wiener filter. Methods for signal power spectrum estimation are presented and a novel combination of standard methods is introduced. The performance of a distortionless optimal microphone weighting, a Wiener post-filter, and two novel filter-and-sum implementations is evaluated and compared with the performance of the standard delay-and-sum beamformer.
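
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a delay-and-sum beamformer together with a simple signal-to-noise ratio measure against a time-aligned reference channel. It is not taken from the dissertation: the function names `delay_and_sum` and `snr_db` are hypothetical, the steering delays are assumed to be known non-negative whole-sample values, and the SNR definition (reference energy over the energy of the difference from the reference) is one common choice that may differ from the measure the dissertation uses.

```python
import math


def delay_and_sum(channels, delays):
    """Align each channel by its steering delay and average the result.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   assumed-known arrival delay of the target at each microphone,
              in whole samples (non-negative integers).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    # Only keep the span where every advanced channel still has samples.
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        raise ValueError("delays leave no overlapping samples")
    count = len(channels)
    return [
        sum(ch[d + t] for ch, d in zip(channels, delays)) / count
        for t in range(usable)
    ]


def snr_db(reference, estimate):
    """SNR in dB, treating (estimate - reference) as the noise."""
    signal_energy = sum(r * r for r in reference)
    noise_energy = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    if noise_energy == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_energy / noise_energy)
```

Advancing each channel by its delay makes the target add coherently across microphones, while noise that is uncorrelated between microphones adds incoherently, so the averaged output would be expected to show a higher SNR than any single channel. Fractional delays, which a practical array generally needs, are deliberately left out of this sketch.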
Keywords/Search Tags: Microphone, Speech, Performance