Biologically-inspired noise-robust speech recognition for both man and machine

Posted on:2005-12-23

Degree:Ph.D

Type:Dissertation

University:University of Florida

Candidate:Skowronski, Mark D

Full Text:PDF

GTID:1458390008483586

Subject:Engineering

Abstract/Summary:

The purpose of this dissertation is to investigate biologically inspired techniques for increasing robustness of speech recognition, for both man and machine. This is accomplished in three regimes: the time domain, feature space, and the classifier. The human auditory system is an existence proof for accurate automatic speech recognition and has solved the principal complexities that currently plague machine recognition: conversational speech, noisy environments, and mismatched test/train data.

The three regimes, unique in their relation to biology as well as their role in recognition, demonstrate the efficacy of biologically inspired computation. In the first regime, human speech recognition is improved in the time domain using energy redistribution, a novel algorithm based on psychoacoustic experiments on the relative information density of typical speech across time. In listening experiments, energy redistribution was shown to decrease recognition error in noisy environments by 40% compared to the experiment control.

In the second regime, the tradeoff between spectral resolution and local signal-to-noise ratio in the frequency domain is controlled by the novel speech front end called human factor cepstral coefficients (HFCC), created by combining the known relationship between critical bandwidth and frequency of the human auditory system with the filter bank design in a popular speech feature extraction algorithm: mel frequency cepstral coefficients (MFCC). In automatic speech recognition simulations of isolated words in noisy environments, HFCC outperformed MFCC by 7 dB.

In the third regime, an emerging area in information processing, based on observations of the chaotic nature of biological sensory systems, is explored. A nonlinear dynamic system, introduced by Walter Freeman and colleagues, models the olfactory sensory system of rabbits and offers an alternative to conventional stochastic models used in automatic speech recognition. In the current dissertation, several critical aspects of Freeman's model are advanced, and the model is applied as an oscillatory network associative memory in static pattern classification experiments. Recognition accuracy of vowel phonemes using Freeman's model compares with optimum performance of a Hamming classifier.
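
The HFCC idea described in the abstract, setting each filter's bandwidth from the critical bandwidth at its center frequency instead of from the spacing between neighbouring filters, can be illustrated with a short sketch. The abstract does not give formulas, so everything below is an assumption made for illustration: the common 2595·log10(1 + f/700) mel mapping, the Moore and Glasberg equivalent rectangular bandwidth (ERB) polynomial as the critical-bandwidth relationship, and the hypothetical helper names and `scale` parameter. It is not the dissertation's implementation.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale mapping (assumed; the abstract does not specify one)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def erb_hz(fc_hz):
    """Moore and Glasberg ERB approximation (assumed critical-bandwidth model).

    The polynomial takes the center frequency in kHz and returns Hz.
    """
    f_khz = fc_hz / 1000.0
    return 6.23 * f_khz ** 2 + 93.39 * f_khz + 28.52


def mel_spaced_centers(f_low, f_high, n_filters):
    """Center frequencies equally spaced on the mel scale, as in MFCC.

    Simplification: the first and last centers sit exactly on f_low and f_high.
    """
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters - 1)
    return [mel_to_hz(m_low + i * step) for i in range(n_filters)]


def filter_bandwidths(centers, scale=1.0):
    """Bandwidth per filter, set from the center frequency alone.

    `scale` is a hypothetical knob for widening or narrowing all filters,
    trading spectral resolution against local signal-to-noise ratio.
    """
    return [scale * erb_hz(fc) for fc in centers]


if __name__ == "__main__":
    centers = mel_spaced_centers(100.0, 4000.0, 10)
    for fc, bw in zip(centers, filter_bandwidths(centers)):
        print(f"center {fc:8.1f} Hz   bandwidth {bw:7.1f} Hz")
```

In this sketch, changing the number of filters moves the centers but leaves the bandwidth at any given center frequency unchanged, which is the decoupling the abstract attributes to HFCC; in a conventional MFCC filter bank the two are tied together.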

Keywords/Search Tags:

Recognition

Related items

1	Research On Hand Feature Fusion Recognition Methods
2	Research On Mathematical Formula Recognition Algorithm And Its Application In Book Content Recognition System
3	The Research On Intelligent Recognition System Of Gate And Its Recognition Technology In Urban Railway Traffic
4	POI (Point Of Interest) Name Recognition And The Application In Dialog System
5	Traffic Video Identifies The Text Detection And Recognition Technology Research
6	The Research Of Fast Identity Recognition Algorithm Based On Iris Recognition Technology
7	Research On Seven Kinds Of Wireless Signal Simulation And Recognition Algorithm
8	Research On Pattern Recognition Methods Of GUI Objects In Automated Testing
9	Research On Radar Modulation Signal Recognition Technology
10	Gait Recognition Technology Research Based On Video Sequence