
Research On Key Issues Of Audio Event Detection And Classification For Complex Audio Documents

Posted on: 2013-01-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Leng
Full Text: PDF
GTID: 1228330374499774
Subject: Signal and Information Processing
Abstract/Summary:
Audio information is an important source for human perception and communication and plays a growing role in daily life. Its application depends largely on audio detection and classification technology. This dissertation addresses two key issues for complex audio documents, audio event detection and audio event classification. The specific work and innovations of this dissertation can be summarized as follows:

1) A BIC based initial training set selection algorithm for active learning. Labeling training samples is expensive, so this dissertation adopts active learning to reduce the manual labeling workload. Initial training set selection is an important part of active learning and has a great influence on the convergence rate. Existing initial training set selection algorithms cause active learning to fail or perform poorly when detecting small-probability events, because they select no samples, or too few samples, of those events. To solve this problem, this dissertation proposes a BIC based initial training set selection algorithm for active learning that considers both the representativeness and the coverage of the selected samples. Experiments show that the proposed algorithm not only solves the detection problem for small-probability events efficiently, but also has clear advantages in detecting events that are not of small probability.

2) Margin and misclassification character based SVM active learning. Among SVM active learning algorithms, the most classical one queries the sample closest to the current hyperplane in each iteration; this dissertation calls it margin based sampling SVM active learning. When the current hyperplane is far from the true hyperplane, however, the sample closest to the current hyperplane is less informative, and selecting a sample solely by its distance to the current hyperplane is not optimal. Because class boundary samples are more easily misclassified, combining margin information with misclassification character increases the probability of selecting the true boundary support vectors. This dissertation therefore proposes a margin and misclassification character based SVM active learning algorithm, which in each iteration queries the sample that is both close to the current hyperplane and more likely to be misclassified. Experiments show that this algorithm further reduces the manual labeling workload.

3) Combining active learning and semi-supervised learning to detect audio events. Because SVM is concerned only with class boundary samples, most SVM active learning algorithms try to find boundary samples using different criteria while neglecting the remaining samples. Considering the relative relationship between class center and class boundary, this dissertation tries to use the class center to better reflect the class boundary: it uses active learning to select boundary samples and, at the same time, uses semi-supervised learning to select class-central samples from the remaining samples, forming an active semi-supervised SVM algorithm. Experiments show that, compared with active learning alone, the proposed active semi-supervised algorithm further reduces the manual labeling workload.

4) Audio event classification strategy for complex audio documents. In complex audio documents, different audio events can overlap in time, which greatly degrades classification performance when current audio classification technologies are applied to such documents. Taking film audio documents as the research object, this dissertation studies the classification problem for complex audio documents and proposes a classification algorithm based on pure models and clustering information. The algorithm exploits the prior distribution information of the samples to increase the probability of classifying an overlap as one of the audio events that the overlap contains. Experiments show that the proposed algorithm improves the classification accuracy of both overlapped and isolated audio events, and thus the overall classification performance, so it can serve as an efficient scheme for audio event classification of complex audio documents.
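
As an illustration of the classical margin based sampling baseline mentioned in contribution 2 (not the dissertation's proposed algorithm), the query step might be sketched as follows. The sketch assumes a linear decision function f(x) = w·x + b; the names `decision_value` and `query_margin_sample`, the weight vector and the sample pool are hypothetical and do not come from the dissertation.

```python
import math
from typing import List, Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Value of the assumed linear decision function f(x) = w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def query_margin_sample(w: Sequence[float], b: float,
                        pool: List[Sequence[float]]) -> int:
    """Return the index of the unlabeled sample closest to the current hyperplane.

    This is plain margin based sampling: the distance of x to the
    hyperplane w.x + b = 0 is |f(x)| / ||w||, and the sample with the
    smallest distance is chosen for manual labeling.
    """
    if not pool:
        raise ValueError("the unlabeled pool is empty")
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("the weight vector must be non-zero")
    distances = [abs(decision_value(w, b, x)) / norm for x in pool]
    return min(range(len(pool)), key=distances.__getitem__)


# Hypothetical current hyperplane and unlabeled pool, for illustration only.
w_current = [1.0, -1.0]
b_current = 0.0
unlabeled_pool = [[3.0, 0.0], [1.0, 0.8], [-2.0, 2.0]]
chosen = query_margin_sample(w_current, b_current, unlabeled_pool)
```

In a full active learning loop, the chosen sample would be labeled by a human, added to the training set, and the SVM retrained before the next query. The abstract's point is that this criterion alone becomes unreliable when the current hyperplane is far from the true one.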
Keywords/Search Tags: audio detection, audio classification, active learning, semi-supervised learning, temporal overlaps