Research On Analysis And Recognition Of Auditory Scenes

Posted on:2014-03-10

Degree:Master

Type:Thesis

Country:China

Candidate:L Yang

Full Text:PDF

GTID:2308330482951980

Subject:Computer technology

Abstract/Summary:

As an important channel through which people perceive their surroundings, hearing complements vision and plays a larger role when sight is obstructed or illumination is poor. Compared with images, audio signals can be acquired with relatively simple equipment and require less storage space and processing time. With the increasing computational capability of mobile platforms, various novel applications based on audio processing have emerged, and the development of the underlying audio algorithms has become an important research direction. Moreover, the effective extraction, analysis and exploitation of the semantic information embedded in audio data play a significant role in content-based multimedia retrieval, summarization, and context-aware or context-adaptive applications.

The thesis focuses on methods for automatically recognizing or classifying auditory contexts from their audio characteristics. Given an audio sample from an unknown auditory context, we assign it to one of the possible context categories. Two steps are adopted for this purpose: 1) extract or construct a set of audio features that capture the intrinsic characteristics of the audio sample, including Mel-frequency cepstral coefficients (MFCC), pseudo-semantic features, and features learned or selected from the wavelet packet decomposition of the audio signal (for example, by local discriminant bases or boosting); 2) construct appropriate auditory context models and classify the audio samples with them. Various models, including the mixture of Gaussians, the hierarchical hidden Markov model and the random forest model, are investigated or proposed in the thesis, along with a detailed description of the properties, composition and classification procedure of these models and the representation of the relations between audio effects and contexts.

The thesis analyzes and compares the performance of the proposed models through experiments on a test dataset consisting of audio samples from 10 categories of auditory contexts and 21 categories of audio effects, collected from the Internet or from TV programs and movies. The experimental results demonstrate the effectiveness of the proposed methods for audio feature extraction and context classification. The thesis concludes by summarizing the present work and discussing possible directions and considerations for further research.
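
The two-step scheme described in the abstract (feature vectors first, then one model per context category) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes the audio features have already been extracted as fixed-length vectors, and it swaps in a single diagonal Gaussian per context in place of the mixture of Gaussians, hierarchical HMM, or random forest models named above. The function names, the context labels "street" and "office", and the toy feature values are all hypothetical.

```python
import math
from collections import defaultdict


def fit_context_models(samples):
    """Fit one diagonal Gaussian per context label.

    samples: list of (feature_vector, context_label) pairs.
    Returns {label: (means, variances)}.
    """
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append(features)

    models = {}
    for label, rows in by_label.items():
        n = len(rows)
        dims = len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dims)]
        # A small floor keeps every variance positive if a feature is constant.
        variances = [
            sum((r[d] - means[d]) ** 2 for r in rows) / n + 1e-6
            for d in range(dims)
        ]
        models[label] = (means, variances)
    return models


def log_likelihood(features, model):
    """Log-density of a feature vector under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(features, means, variances)
    )


def classify(features, models):
    """Assign the sample to the context whose model scores it highest."""
    return max(models, key=lambda label: log_likelihood(features, models[label]))


# Toy 2-D vectors standing in for real audio features (made-up values).
training = [
    ([0.9, 0.1], "street"), ([1.1, 0.2], "street"), ([1.0, 0.0], "street"),
    ([0.1, 0.9], "office"), ([0.2, 1.1], "office"), ([0.0, 1.0], "office"),
]
models = fit_context_models(training)
predicted = classify([0.95, 0.15], models)
```

In a fuller version, the toy vectors would be replaced by per-sample MFCC or wavelet-packet features, and the single Gaussian by one of the richer models the thesis investigates. The maximum-likelihood decision rule in `classify` would stay the same for any per-context generative model.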

Keywords/Search Tags:

auditory scene, audio feature, context recognition, random forest, HMM

Related items

1	Audio Analysis Based On Content And Scene Recognition
2	Research On Audio Scene Detection Method For Intelligent Mobile Terminal
3	Research On High-performance Audio Scene Recognition Technology
4	Research On Audio Feature Extraction And Context Recognition Based On Deep Neural Networks
5	Facial Expression Recognition Based On WMCBP-WWEF Feature Fusion Using Random Forest
6	Semantic Segmentation Of Street Scene Based On Random Forest Algorithm
7	Prediction Of Road Traffic Concentration Using Random Forest Algorithm Based On Feature Compatibility
8	Research On Audio Recognition Method Based On Residual Network And Random Forest
9	Research On Random Forest Algorithm Based On Feature Selection And Diversity
10	Study Of The Understanding Of The Spatial Layout And Semantic Segmentation Towards Traffic Scene Images