Research On Audio Event Detection And Classification Based On Surprise Model And Spectrogram

Posted on: 2016-11-08, Degree: Master, Type: Thesis
Country: China, Candidate: X X Cheng, Full Text: PDF
GTID: 2308330473957038, Subject: Electronic and communication engineering
Abstract/Summary:
Audio information is an important component of multimedia information and plays a growing role in people's daily lives. Its application, however, depends largely on audio detection and recognition technology, which therefore has considerable social significance and practical value. In this thesis, an audio event is defined as a part of the audio information that is relatively salient and readily attracts attention. We refer to these events as "salient" audio events or abnormal sound fragments. Endpoint detection of audio events mostly relies on short-time energy and the short-time zero-crossing rate, and feature extraction mostly follows traditional speech signal processing methods such as Mel-Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstral Coefficients (LPCC). Because audio events include both speech and non-speech signals, endpoint detection and feature extraction methods designed for speech processing are insufficient. Based on this analysis, the thesis proposes an endpoint detection method based on the Bayesian surprise model and a feature extraction method based on spectrogram characteristics, and then uses Sparse Representation Classification (SRC) for classification. The main work and contributions are as follows:
(1) Endpoint detection of "salient" audio events. Because "salient" audio events stand out to a certain degree within the audio signal, the thesis uses the Teager energy operator and the energy separation algorithm (ESA) to extract saliency features of the audio signal based on the AM-FM model. An audio surprise curve is then obtained through analysis with the Bayesian surprise model. Finally, the "salient" sound fragments are located and extracted.
(2) Feature extraction for "salient" audio events based on the spectrogram. Analysis of the spectrograms of "salient" audio events indicates that abnormal sound spectrograms differ markedly in the direction and fineness of their time-frequency structure. This structure reflects the nature of abnormal sound signals and discriminates well between them, so effective features for sound classification and recognition can be extracted from the spectrograms of abnormal sound fragments. Abnormal sounds are first extracted and converted into spectrograms. To better represent the time-frequency structure of these spectrograms, the structure is then described using 2D Gabor filters. Finally, features are extracted from the spectrograms using the Gray Level Co-occurrence Matrix (GLCM) and serve as the characteristic parameters of "salient" audio events for classification and recognition.
(3) The thesis establishes a global sparse representation model based on abnormal sound spectrograms and the sparse representation algorithm. A sparse representation classification algorithm based on L1-norm minimization is then used to represent the signal to be identified over the dictionary, and "salient" audio events are classified and recognized by the minimum-residual rule. Finally, experimental comparison with traditional methods verifies the effectiveness of the proposed method.
The research work and results of this thesis offer some reference value for the detection and recognition of "salient" audio events.
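As an illustration of the Teager energy operator named in (1), the sketch below uses the standard discrete form psi[n] = x[n]^2 - x[n-1] * x[n+1]. The abstract does not give the thesis's own formulas or parameters, so the function name, the sampling rate, and the test tone are assumptions chosen only for illustration, and this should not be read as the author's implementation.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns an array two samples shorter than x (the endpoints are dropped).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# Hypothetical test tone (assumed values): amplitude 0.5, 440 Hz, 16 kHz sampling.
fs = 16000.0
amplitude = 0.5
omega = 2.0 * np.pi * 440.0 / fs  # digital frequency in radians per sample
n = np.arange(1024)
tone = amplitude * np.cos(omega * n)

psi = teager_energy(tone)

# For a pure cosine the operator is constant: A^2 * sin^2(omega).
expected = amplitude ** 2 * np.sin(omega) ** 2
```

For a pure tone the output is constant and grows with both amplitude and frequency, which is the property that makes the operator a candidate saliency feature and that the energy separation algorithm builds on to estimate the AM and FM components.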
Keywords/Search Tags: audio event detection and recognition, Bayesian surprise model, spectrogram, 2D Gabor filter, sparse representation