
Research On Analysis And Recognition Of Auditory Scenes

Posted on: 2014-03-10  Degree: Master  Type: Thesis
Country: China  Candidate: L Yang  Full Text: PDF
GTID: 2308330482951980  Subject: Computer technology
Abstract/Summary:
As an important channel through which people perceive their environment, hearing complements vision and can play the larger role when sight is obstructed or illumination is poor. Compared with images, audio signals can be acquired with relatively simple equipment and require less storage space and processing time. With the growing computational capability of mobile platforms, many novel applications based on audio processing have emerged, and developing the underlying audio algorithms has become an important research direction. Moreover, effectively extracting, analyzing and exploiting the semantic information embedded in audio data plays a significant role in content-based multimedia retrieval, summarization, and context-aware or context-adaptive applications.
The thesis focuses on methods for automatically recognizing, or classifying, auditory contexts from their audio characteristics. Given a sample of audio from an unknown auditory context, we assign it to one of a set of possible context categories. Two steps are adopted for this purpose: 1) extract or construct a set of audio features that capture the intrinsic characteristics of the audio sample, including Mel-frequency cepstral coefficients (MFCC), pseudo-semantic features, and features learned or selected from the wavelet packet decomposition of the audio signal (for example, by local discriminant bases or boosting); 2) construct appropriate auditory context models and classify the audio samples with them.
Several models, including the mixture of Gaussians, the hierarchical hidden Markov model and the random forest model, are investigated or proposed in the thesis, together with a detailed description of their properties, composition and use in classification, and of how they represent the relations between audio effects and contexts.
The thesis analyzes and compares the performance of the proposed models through experiments on a test dataset consisting of audio samples from 10 categories of auditory contexts and 21 categories of audio effects, collected from the Internet or from TV programs and movies. The experimental results demonstrate the effectiveness of the proposed methods for audio feature extraction and context classification. The thesis closes by summarizing the present work and discussing possible directions and considerations for further research.
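The two-step scheme described in the abstract (per-frame features, then one statistical model per context) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes that feature vectors such as MFCCs have already been computed, uses a single diagonal-covariance Gaussian per context as the simplest special case of a mixture of Gaussians, and the context labels, toy 2-D vectors and function names are all hypothetical.

```python
import math
from collections import defaultdict


def fit_diag_gaussian(frames):
    """Estimate a per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # A small floor keeps every variance strictly positive.
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return mean, var


def log_likelihood(frame, model):
    """Log density of one feature vector under a diagonal Gaussian."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def train(labelled_samples):
    """Fit one model per context from (label, frames) pairs."""
    pooled = defaultdict(list)
    for label, frames in labelled_samples:
        pooled[label].extend(frames)
    return {label: fit_diag_gaussian(frames) for label, frames in pooled.items()}


def classify(frames, models):
    """Pick the context whose model gives the highest total log-likelihood."""
    scores = {
        label: sum(log_likelihood(f, model) for f in frames)
        for label, model in models.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: 2-D vectors stand in for real MFCC frames,
# and "street" / "office" stand in for the thesis's context categories.
training = [
    ("street", [[0.9, 0.1], [1.1, -0.1], [1.0, 0.2]]),
    ("office", [[-1.0, 0.0], [-0.8, 0.1], [-1.2, -0.2]]),
]
models = train(training)
predicted = classify([[0.95, 0.05], [1.05, 0.0]], models)
print(predicted)
```

Summing frame log-likelihoods treats frames as independent, which ignores temporal structure. Under that assumption, a hidden Markov model would be the natural next step, since it models transitions between frames.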
Keywords/Search Tags: auditory scene, audio feature, context recognition, random forest, HMM