Auditory-Based Noise-Robust Audio Classification Algorithms

Posted on:2009-11-21

Degree:Ph.D

Type:Thesis

University:McGill University (Canada)

Candidate:Chu, Wei

GTID:2448390002490646

Subject:Engineering

Abstract/Summary:

The past decade has seen extensive research on audio classification algorithms, which play a key role in multimedia applications such as the retrieval of audio information from an audio or audiovisual database. However, the effect of background noise on classification performance has not been widely investigated. Motivated by the noise-suppression property of the early auditory (EA) model presented by Wang and Shamma, we seek in this thesis to further investigate this property and to develop improved algorithms for audio classification in the presence of background noise.

With respect to the limitation of the original analysis, a better yet mathematically tractable approximation approach is first proposed, wherein the Gaussian cumulative distribution function is used to derive a new closed-form expression of the auditory spectrum at the output of the EA model and to conduct the relevant analysis.

Considering the computational complexity of the original EA model, a simplified auditory spectrum is proposed, wherein the underlying analysis naturally leads to a frequency-domain approximation for further reduction in the computational complexity. Based on this time-domain approximation, a simplified FFT-based spectrum is proposed wherein a local spectral self-normalization is implemented. An improved implementation of this spectrum is further proposed to calculate a so-called FFT-based auditory spectrum, which allows more flexibility in the extraction of noise-robust audio features.

To evaluate the performance of the above FFT-based spectra, speech/music/noise and noise/non-noise classification experiments are conducted wherein a support vector machine algorithm (SVMstruct) and a decision tree learning algorithm (C4.5) are used as the classifiers. Several features are used for the classification, including the conventional mel-frequency cepstral coefficient (MFCC) features as well as DCT-based and spectral features derived from the proposed FFT-based spectra. Compared to the conventional features, the auditory-related features show more robust performance in mismatched test cases. Test results also indicate that the performance of the proposed FFT-based auditory spectrum is slightly better than that of the original auditory spectrum, while its computational complexity is reduced by an order of magnitude.

Finally, to further explore the proposed FFT-based auditory spectrum from a practical audio classification perspective, a floating-point DSP implementation is developed and optimized on the TMS320C6713 DSP Starter Kit (DSK) from Texas Instruments.
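
The abstract does not give the exact form of the local spectral self-normalization or of the DCT-based features, so the following Python sketch is only an illustration under stated assumptions and should not be read as the thesis's algorithm. It assumes that self-normalization means dividing each power-spectrum bin by the mean power of its neighbouring bins, and that the DCT-based features are the first few DCT-II coefficients of the log-normalized spectrum. All function names, the Hann window, the neighbourhood half-width and the number of coefficients are hypothetical choices made for the example.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(frame):
    """One-sided power spectrum of a Hann-windowed frame."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    return [abs(c) ** 2 for c in fft(windowed)[: n // 2 + 1]]


def self_normalize(power, half_width=4, eps=1e-12):
    """ASSUMED form of local self-normalization: each bin is divided by
    the mean power of the bins within half_width of it."""
    out = []
    for k in range(len(power)):
        lo = max(0, k - half_width)
        hi = min(len(power), k + half_width + 1)
        local_mean = sum(power[lo:hi]) / (hi - lo)
        out.append(power[k] / (local_mean + eps))
    return out


def dct2(values, n_coeffs):
    """First n_coeffs coefficients of an unnormalized DCT-II."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def frame_features(frame, n_coeffs=8):
    """Hypothetical DCT-based feature vector for one audio frame."""
    normalized = self_normalize(power_spectrum(frame))
    log_spec = [math.log(v + 1e-12) for v in normalized]
    return dct2(log_spec, n_coeffs)


def demo_frame(n=64, seed=0):
    """Synthetic test frame: a sinusoid plus low-level uniform noise."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * 5 * i / n) + 0.1 * rng.uniform(-1, 1)
            for i in range(n)]


if __name__ == "__main__":
    print([round(c, 3) for c in frame_features(demo_frame())])
```

Because every bin is expressed relative to its local average, scaling the input frame by a constant gain leaves these features essentially unchanged. That property is one plausible reason a self-normalized spectrum could be less sensitive to level mismatch between training and test conditions, though the abstract itself attributes the robustness to the auditory-related features in general rather than to this specific mechanism.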

Related items

1	Biologically inspired auditory attention models with applications in speech and audio processing
2	Research And Implementation Of Audio Fingerprint Algorithm Based On Auditory Mechanism
3	The integration of audio into multimodal interfaces: Guidelines and applications of integrating speech, earcons, auditory icons, and spatial audio (SEAS)
4	The Study, Based On Human Auditory System In The Transform Domain Audio Watermarking Algorithm
5	Improvement And Research On Audio Digital Watermarking Algorithm Based On Auditory Frequency Masking
6	Research On Digital Audio Information Hiding Technology Based On Transform Domain
7	Multiple Audio Signal Separation And Identification Technology Research
8	Monaural Speech Segregation Based On Computational Auditory Scene Analysis
9	Research On Analysis And Recognition Of Auditory Scenes
10	Research On The Development Of Audio Media Under The Background Of Mobile Internet