Audio Analysis Based On Content And Scene Recognition

Posted on: 2014-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Wang
Full Text: PDF
GTID: 2308330482452247
Subject: Computer technology
Abstract/Summary:
Audio analysis based on content and scene recognition is a new research direction in the multimedia field and is still under active investigation. Audio signals fall into two types: structured and unstructured. Audio research has concentrated mainly on processing structured signals, as in speech recognition and music retrieval, while audio retrieval for unstructured environmental scenes has received relatively little attention. Extracting an audio summary and its semantic content is the key problem in content-based audio retrieval, and it has important theoretical value and practical application prospects.
Existing audio retrieval algorithms typically target a specific type of audio and place substantial restrictions on the structure of the audio, as in music retrieval algorithms based on similarity analysis. Audio retrieval based on supervised learning and audio retrieval based on unsupervised learning each have their own limitations. This paper analyzes the advantages and disadvantages of these two approaches and proposes a new audio retrieval method based on content and semantic understanding. Experimental results show that the method is highly effective for environmental scene sounds.
In this paper, an audio segmentation method based on changes in the audio environment scene is first applied to the input audio data, and audio features are extracted from the resulting segments. Similar audio segments are grouped by a spectral clustering algorithm, and each cluster is treated as an audio event. Then the background sound event and the key audio events are detected, and all audio events, represented by domain feature vectors, are annotated according to their similarity to training audio events.
Finally, a context model is proposed to correct incorrectly labeled audio segments across all audio events, and a scene recognition algorithm is built that is simpler and has lower computational complexity than recognition algorithms based on pseudo-semantic features. This algorithm builds an HMM for each training scene and then classifies the test scenes.
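As a rough illustration of the last step, one HMM per scene followed by classification, the sketch below scores a sequence of audio-event labels under each scene's HMM with the scaled forward algorithm and picks the scene with the highest log-likelihood. The abstract does not give the HMM type, the features, or any parameters, so everything concrete here is an assumption made for illustration: the discrete observation symbols, the two scene names, the hand-set probabilities, and the function names `log_likelihood` and `classify_scene`. In practice the parameters would be estimated from training scenes rather than written by hand.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) using the scaled forward algorithm.

    obs   : list of integer observation symbols (assumed: audio-event labels)
    start : start[s]    = P(first state is s)
    trans : trans[p][s] = P(next state is s | current state is p)
    emit  : emit[s][o]  = P(symbol o | state s)
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def classify_scene(obs, models):
    """Pick the scene whose HMM assigns the highest log-likelihood to obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical symbols: 0 = traffic, 1 = speech, 2 = birdsong.
# Hypothetical hand-set two-state models: (start, trans, emit) per scene.
MODELS = {
    "street": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.8, 0.15, 0.05], [0.3, 0.6, 0.1]],
    ),
    "park": (
        [0.5, 0.5],
        [[0.8, 0.2], [0.3, 0.7]],
        [[0.05, 0.15, 0.8], [0.1, 0.6, 0.3]],
    ),
}

if __name__ == "__main__":
    print(classify_scene([0, 0, 1, 0], MODELS))
    print(classify_scene([2, 2, 1, 2], MODELS))
```

Scoring each test sequence under every scene model costs time proportional to the sequence length times the square of the number of states, which is in keeping with the low computational complexity the abstract claims, although the thesis's actual model sizes are not stated here.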
Keywords/Search Tags: audio segmentation, audio event, scene recognition, spectral clustering, context model