With the spread of computers and the continuing development of multimedia technology, multimedia information has an ever greater influence on daily life. To find the desired content quickly in large volumes of multimedia data, audio signals should first be classified and then retrieved, which improves retrieval efficiency. Although audio classification is fundamentally a pattern recognition task, it draws on many other disciplines, including statistics, digital signal processing, speech recognition, and neural networks. Audio classification has considerable application value and broad prospects in fields such as efficient multimedia signal coding, automatic speech recognition, music genre classification and recognition, video conferencing, military applications, and investigation.

This thesis first describes the background and significance of audio classification. Based on an analysis of audio classification techniques at home and abroad, it then classifies audio signals into four classes: silence, speech, music, and speech with background sound. The principles of sound production in audio signals are studied. To reduce the influence of factors such as noise, the audio signal is pre-processed before feature extraction by pre-emphasis, framing, and windowing. Feature extraction is divided into two types: time-domain and frequency-domain. The time-domain features include short-time average energy, zero-crossing rate, sub-band energy ratio, and spectral centroid, while the frequency-domain features include silence ratio, low-frequency energy ratio, high zero-crossing rate ratio, and low zero-crossing rate ratio.

In view of the time-varying characteristics of audio signals, a cascade classifier based on the hidden Markov model (HMM) and the support vector machine (SVM) is proposed.
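The pre-processing chain and two of the time-domain features named above (short-time average energy and zero-crossing rate) can be illustrated with a minimal sketch. It is not the thesis implementation: the pre-emphasis coefficient 0.97, the Hamming window, the frame length of 256 samples, the hop of 128 samples, and all function names are assumptions chosen for illustration.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split into overlapping frames; samples that do not fill a frame are dropped."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(frame_len):
    """Hamming window coefficients (assumed window type)."""
    if frame_len == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]


def short_time_energy(frame):
    """Mean of the squared samples in one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def extract_features(signal, frame_len=256, hop=128, alpha=0.97):
    """Return one (energy, zero-crossing rate) pair per windowed frame."""
    window = hamming(frame_len)
    features = []
    for frame in frame_signal(pre_emphasis(signal, alpha), frame_len, hop):
        windowed = [x * w for x, w in zip(frame, window)]
        features.append((short_time_energy(windowed), zero_crossing_rate(windowed)))
    return features
```

Calling `extract_features(samples)` on a list of float samples yields one feature pair per frame; clip-level statistics such as a silence ratio could then be derived from these frame-level values.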
The HMM parameters are estimated with the Baum-Welch algorithm, whereas the SVM parameters are computed with the ν-SVM algorithm. The cascade classifier is designed as follows. Following conventional audio classification, if the audio is silent, the classification result is given directly. Otherwise, the trained HMM classifier is applied and its maximum and second-largest output probabilities are computed; these two probabilities then serve as the input sample of the corresponding SVM classifier, which performs the classification.

Simulation experiments are then carried out on a standalone HMM classifier, a standalone SVM classifier, and the HMM-SVM cascade classifier, and the classification ability of the three classifiers is compared by analyzing the results. The results show that the feature extraction method is effective, that the cascade classifier has strong classification ability, and that the accuracy of audio classification is improved. The classification algorithm presented in this thesis therefore has some reference value for further algorithm research.
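The decision flow of the cascade classifier can be sketched as below. This is an illustration under stated assumptions, not the thesis code: the silence detector, the HMM scorer, and the SVMs are passed in as plain callables (all hypothetical stand-ins for trained models), the class labels are invented for the example, one SVM per pair of classes is assumed, and the fallback to the HMM decision when no such SVM exists is also an assumption.

```python
def classify_clip(features, is_silence, hmm_scores, pairwise_svms):
    """Cascade decision: silence check, then HMM ranking, then a pairwise SVM.

    features      -- feature data for one audio clip
    is_silence    -- callable(features) -> bool (hypothetical silence detector)
    hmm_scores    -- callable(features) -> {class_label: log-likelihood}
    pairwise_svms -- {frozenset({label_a, label_b}): callable([score_a, score_b]) -> label}
    """
    # Stage 1: silent clips are labeled directly.
    if is_silence(features):
        return "silence"

    # Stage 2: score the clip with every class HMM and keep the two best classes.
    scores = hmm_scores(features)
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, second = ranked[0], ranked[1]

    # Stage 3: hand both scores to the SVM trained for that pair of classes.
    svm = pairwise_svms.get(frozenset((best, second)))
    if svm is None:
        return best  # assumed fallback: keep the HMM decision

    # Fixed (alphabetical) score order so the SVM always sees a consistent layout.
    label_a, label_b = sorted((best, second))
    return svm([scores[label_a], scores[label_b]])


# Usage with stub models standing in for trained classifiers.
stub_scores = {"speech": -10.0, "music": -10.5, "speech_with_background": -30.0}
stub_svms = {
    # Input order is [music score, speech score] because labels are sorted.
    frozenset(("music", "speech")): lambda x: "music" if x[0] - x[1] > -1.0 else "speech",
}
result = classify_clip(
    features=[0.2, 0.1],
    is_silence=lambda f: max(f) < 0.01,
    hmm_scores=lambda f: stub_scores,
    pairwise_svms=stub_svms,
)
```

In this stub run the HMM alone would pick "speech", but the two top scores are close, so the pairwise SVM makes the final call. That is the situation the cascade is meant to handle.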