
SVM-based Ensemble Learning Audio Classification Algorithm

Posted on: 2008-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Sun
Full Text: PDF
GTID: 2178360212996412
Subject: Communication and Information System
Abstract/Summary:
The development of multimedia technology and the Internet has brought people a vast sea of information and, with it, very large multimedia databases. Multimedia information is hard to retrieve by keyword search alone, so an effective retrieval method designed specifically for multimedia is needed; this is the major problem facing the field. Under these circumstances, "content-based retrieval" emerged. It is a technology that retrieves information by searching the content of the target and the features of its context. Specifically, it first analyzes the color and texture of a picture, scene, or fragment, then extracts features, and then performs similarity matching based on those features.

Because of the complexity of audio, it is more difficult to retrieve than image and video. Apart from limited registration information such as sampling rate, quantization precision, and coding, the original audio data is merely a non-semantic symbol sequence and an unstructured binary stream. It lacks a semantic description of its content and a structured organization, and it is characterized by highly correlated information, complex data structure, large data volume, and demanding processing requirements. All of this makes in-depth processing and analysis of audio information very difficult and restricts audio retrieval and content filtering. Extracting the semantic structure and content from audio information, so that audio data changes from disordered to ordered, is the key to in-depth analysis of audio information and to content-based retrieval. Audio classification is the key technology for solving this problem and is the basis of audio structuring. Audio classification achieves audio structuring to a certain degree and provides a foundation for obtaining higher-level semantic structures of audio content. The classification results carry semantics, but semantics at this level are generic and weak.
Users may be interested in audio content at a higher semantic level (such as a segment of broadcast news or a conversation in a drama). The classification results provide the basis for deriving the higher-level semantics of the audio structure and for linking the lower-level structural units of audio with higher-level semantic structural units.

The extraction and analysis of audio features is the basis of audio classification. The selected features should fully express the important classification characteristics of audio in the frequency domain and the time domain, and they must be robust and general under changes in the environment.

There are several typical approaches to the design and implementation of audio classification algorithms:

Rule-based audio classification. The basic idea is to select appropriate features that can identify certain types of audio and then set a threshold for each feature. Following rules set in advance, the actually computed feature value is compared with the threshold to identify the audio category. This method is simple, but because of its simplicity it is applicable only to audio types with simple characteristics.

Minimum distance classification. This classifier uses the idea of template matching: a template is established for each type of audio, and the feature vector computed from the actual audio is matched against the templates to identify the audio type.

Statistics-based audio classification. This is the focus of audio classification research. It provides an effective way to achieve automatic classification through self-learning, and it is the main current and future direction of research in this field. Early statistics-based audio classification algorithms mainly used neural networks.
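The rule-based and minimum distance approaches described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the feature choice (short-time energy and zero-crossing rate), the threshold value, the class names, and the training vectors are all assumptions made up for the example.

```python
import math


def rule_based_classify(energy, threshold=0.01):
    """Rule-based: compare one computed feature with a preset threshold.

    The rule and the threshold are hypothetical: frames whose short-time
    energy falls below the threshold are labeled as silence.
    """
    return "silence" if energy < threshold else "non-silence"


def build_templates(labeled_vectors):
    """Build one template per class as the mean of its training vectors."""
    sums, counts = {}, {}
    for label, vec in labeled_vectors:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_distance_classify(vec, templates):
    """Minimum distance: pick the class whose template is nearest."""
    return min(templates, key=lambda label: euclidean(vec, templates[label]))


# Made-up training vectors: (zero-crossing rate, short-time energy).
training = [
    ("speech", (0.15, 0.30)),
    ("speech", (0.17, 0.34)),
    ("music", (0.05, 0.70)),
    ("music", (0.07, 0.66)),
]
templates = build_templates(training)

print(rule_based_classify(0.001))
print(min_distance_classify((0.14, 0.35), templates))
```

Both classifiers need no training beyond choosing a threshold or averaging templates, which is why they are simple but limited to audio types whose features separate cleanly.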
In recent years, with the rapid development of artificial intelligence and machine learning, more and more researchers have applied statistical learning algorithms such as the hidden Markov model, the K-Nearest Neighbor algorithm, and the Gaussian mixture model to audio classification. The Support Vector Machine (SVM) is a learning algorithm based on statistical learning theory; its classification is founded on the structural risk minimization principle. The method was proposed by Vapnik in the 1990s. It generalizes well and can achieve good accuracy with small samples. SVM has already been applied to audio classification, but existing work does not analyze the parameters and relies only on empirical values. Cross-validation allows each classifier to be trained with the best parameters, thereby improving classification performance, but this exhaustive approach consumes a great deal of time and has difficulty meeting the requirements of practical applications.

Ensemble learning, also known as ensembles of classifiers, is a newer classification framework that integrates the outputs of multiple learners to improve prediction accuracy. Compared with a single algorithm, ensemble learning is less prone to over-fitting. In recent years, because ensemble learning can significantly improve the generalization ability of learning systems, researchers in machine learning, neural networks, statistics, and many other fields have paid close attention to it. It has become a relatively active area of research and is regarded as one of the four major research directions in machine learning.

This thesis presents an ensemble learning algorithm built from SVM-based weak classifiers and compares it with an SVM whose parameters are selected by cross-validation. The proposed algorithm achieves good results: it reduces training time without reducing the classification accuracy.
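The idea of integrating the outputs of several learners can be sketched as below. The abstract does not say how the member outputs are combined, so majority voting is an assumption here, and the threshold-based members are hypothetical stand-ins for the SVM weak classifiers of the thesis.

```python
from collections import Counter


def majority_vote(classifiers, sample):
    """Return the label predicted by most member classifiers.

    Ties go to the label that received its first vote earliest.
    """
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


def make_member(threshold):
    """Hypothetical weak classifier: a single threshold on one feature."""
    return lambda x: "speech" if x > threshold else "music"


# Three members that differ only in one parameter (the threshold).
members = [make_member(t) for t in (0.08, 0.10, 0.14)]

print(majority_vote(members, 0.12))  # two of the three members vote "speech"
```

Each member is cheap to build because no per-member parameter search is run; the ensemble relies on the vote to compensate for individually imperfect members.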
The research work and results of this dissertation can be summarized as follows:

1. The short-time technique for processing audio is introduced. The semantic content of audio is analyzed, and structural units at different levels of the audio structure are defined. The principles of audio classification are introduced in detail, and the basic flow is given.

2. Audio features are studied in depth at the frame level and the clip level, features are extracted at each level, and the feature vector is obtained by a weighting technique. SVM training algorithms and the construction of SVM classifiers are studied in depth, and a method is given for determining the best parameters using cross-validation and grid search.

3. The effect of the RBF kernel's parameters is studied in depth. This dissertation exploits this effect in the ensemble learning algorithm and obtains a new method of constructing the classifier. The classification algorithm is an important part of implementing audio classification, and this dissertation demonstrates the validity of the new method with results from practical experimental data.
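The short-time, frame-level and clip-level feature extraction mentioned in items 1 and 2 can be illustrated roughly as follows. The abstract does not list the features used, so short-time energy and zero-crossing rate are assumed here as common examples; the frame length, hop size, and test signal are also assumptions, and the weighting step is omitted because it is not specified.

```python
import math
import statistics


def split_frames(samples, frame_len, hop):
    """Cut a sample sequence into overlapping short-time frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Frame-level feature: mean squared amplitude."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Frame-level feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def clip_features(samples, frame_len=256, hop=128):
    """Clip-level vector: mean and standard deviation of each frame-level feature."""
    frames = split_frames(samples, frame_len, hop)
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    return [statistics.mean(energy), statistics.pstdev(energy),
            statistics.mean(zcr), statistics.pstdev(zcr)]


# Synthetic test clip: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
features = clip_features(tone)
print(features)
```

A vector like `features` is what a classifier such as an SVM would take as input for one audio clip.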
Keywords/Search Tags: audio classification, audio feature extraction, SVM, ensemble learning ESL-SVM