Research On Audio Effects Detection And Semantic Analysis In Complex Audio Environment

Posted on: 2012-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Ma
GTID: 2178330335460293
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
The ability to identify sounds and analyze audio semantics in complex audio environments is highly useful for multimedia retrieval, security, and many mobile robotic applications. It has become a hot topic in semantic content-based audio analysis and retrieval, but it has not yet been solved well. The main difficulties are that, in a complex audio environment, audio effects vary widely within each category; background noise easily submerges the target audio effects; there are many confusable audio effects; and the key audio effects make up only a small proportion of the entire audio stream, so the data are imbalanced and the resulting detection precision is too low to meet application requirements. The problems to be solved for audio event detection and semantic analysis in a complex audio environment are audio effect definition, discriminative features, model selection, and detection strategies.

In this paper, audio was extracted from films, TV programs, and other complex audio environments, and methods and techniques for audio effects detection and semantic analysis were explored. The work covers the theory and practice of single audio effect detection, key audio effect detection, unsupervised audio effect extraction, and scene segmentation. It focuses on audio effect modeling, detection strategies, the application of unsupervised methods, imbalanced data, and a preliminary study of scene segmentation as a form of high-level semantic analysis. The main content is as follows.

Single audio effect detection based on pre-segmentation is the foundation of audio effects detection research. This paper used an SVM to detect the segments in which people speak in news broadcasts and used a finite state machine to smooth the results. Precision and recall on the test set reached 81.25% and 82%, respectively. The author also carried out a preliminary study of classification on imbalanced data sets and used down-sampling to address the problem, which increased the sample recognition rate by 5.4%.

Although the ability to identify sounds in complex audio environments is highly useful for multimedia retrieval, security, and many mobile robotic applications, very little work has been done in this area. Taking explosion detection in movies as an example, this paper presents a general method for detecting key audio effects in complex audio environments. Appropriate audio features were selected, and audio effects were modeled with an AdaBoost-based decision tree model combined with a multi-level search strategy and a smoothing method. Experiments showed that the method worked well: precision and recall on the test set reached 78.4% and 81.2%, respectively.

Supervised audio effects detection is restricted by its training data, so it is domain-dependent and its ability to generalize is limited. This paper therefore tried to extract audio effects with unsupervised methods and to explore high-level semantics in the form of scene segmentation. Magnitude-based factor spectral clustering was applied iteratively to extract audio effects, and the key audio effects were obtained by computing weights analogous to the TF-IDF weights used in text analysis. A rule was then used to segment scenes in the audio stream. Experiments showed that the method worked well.
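The TF-IDF analogy in the abstract can be illustrated with a minimal sketch. It assumes that each audio segment has already been reduced to a list of cluster labels, which play the role of words in a document. The input format, the label names, and the functions `tfidf_weights` and `key_effects` are hypothetical and are not taken from the thesis, whose exact weighting scheme is not given in the abstract.

```python
import math
from collections import Counter


def tfidf_weights(segments):
    """Return one {label: weight} dict per segment.

    segments: list of lists of cluster labels (assumed input format).
    A label is weighted highly when it is frequent inside a segment
    but occurs in few segments overall.
    """
    n_segments = len(segments)
    # Document frequency: number of segments in which each label occurs.
    doc_freq = Counter()
    for labels in segments:
        doc_freq.update(set(labels))

    weights = []
    for labels in segments:
        counts = Counter(labels)
        total = len(labels)
        seg_weights = {}
        for label, count in counts.items():
            tf = count / total                          # term frequency
            idf = math.log(n_segments / doc_freq[label])  # inverse document frequency
            seg_weights[label] = tf * idf
        weights.append(seg_weights)
    return weights


def key_effects(seg_weights, top_k=1):
    """Return the top_k labels of one segment, ranked by weight."""
    ranked = sorted(seg_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:top_k]]


# Toy data: three segments described by hypothetical cluster labels.
segments = [
    ["speech", "speech", "music", "explosion"],
    ["speech", "music", "music"],
    ["speech", "speech", "speech"],
]
weights = tfidf_weights(segments)
top_first_segment = key_effects(weights[0])
```

In this toy data, a label that appears in every segment receives a weight of zero, so background-like effects drop out and rarer effects are ranked first.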
Keywords/Search Tags: audio effects, imbalanced data set, detection method, unsupervised learning