
Research On Acoustic Event Detection Algorithm Based On Time-Frequency Feature Information

Posted on: 2020-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: R X Yang
Full Text: PDF
GTID: 2428330590996430
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, digital multimedia technology has developed rapidly. Sound is the main medium of human communication, and it pervades people's daily life in the form of audio. People often need to distinguish the sounds of events happening around them to judge where they are, so acoustic event detection technology emerged and has attracted the attention of researchers. For audio data collected by a recording device, determining when an acoustic event occurred and what kind of event it was has become the key research problem in acoustic event detection. This thesis focuses on acoustic event detection technology. The main contributions are as follows:

(1) To address the problem that endpoint detection in the field of acoustic event detection is not mature enough, an endpoint detection algorithm based on MFCC (Mel Frequency Cepstral Coefficients) cepstrum distance and short-term energy distance is proposed. The time-domain features of an audio signal directly describe its waveform, while the frequency-domain features reflect the detailed information of the signal itself. Combining the time-domain and frequency-domain descriptions of the audio signal, we calculate the MFCC distance and the short-term energy distance between each frame and the average value of the noise. We set separate thresholds for the MFCC cepstrum distance and the short-term energy distance, and integrate the results of the two feature distances to obtain the detection result. Experimental results show that the algorithm can solve the problems of insufficient sound feature information and unclear noise segment boundaries. Compared with the classical double-threshold method, the F score can be increased by 0.245 at low SNR. Compared with reference [26], the F score can be increased by 0.2 while the computational complexity is reduced.

(2) Considering the diversity of acoustic sources and the instability of their propagation, a feature extraction algorithm based on EMD (Empirical Mode Decomposition) and GFCC (Gammatone Frequency Cepstral Coefficients) is proposed. The audio signal is decomposed by EMD into several IMFs (Intrinsic Mode Functions), and the main components are selected by calculating the correlation coefficient. The GFCCs of these IMF components are then extracted as feature coefficients to generate EGFCC (EMD-GFCC), which can reduce the loss of audio data and retain more detail during feature extraction. The GMM (Gaussian Mixture Model) is selected as the classifier because it generates a unique model for each type of acoustic event and can achieve good results with only a few parameters. Experiments show that EGFCC is close to the frequency selection characteristics of the human ear, and the combination of EGFCC. Compared with reference [23], the recognition rate is improved by 4.45%. Compared with reference [24], the computational complexity is reduced and the anti-noise performance is improved, with similar overall performance at low SNR. Compared with reference [26], the computational complexity is reduced and the F score is increased by 0.187.
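The endpoint detection idea in contribution (1) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: it assumes the MFCC vectors are computed elsewhere, that the first few frames contain only background noise, that Euclidean distance is used for the MFCC distance, and that the two threshold decisions are combined with a logical AND. The abstract does not specify any of these details, and all function names, parameters, and threshold values are hypothetical.

```python
import math

def short_term_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def detect_event_frames(frames, mfccs, noise_frames=5,
                        mfcc_threshold=1.0, energy_threshold=1.0):
    """Return one boolean per frame: True where an event is flagged.

    frames: list of sample lists; mfccs: per-frame MFCC vectors
    (computed elsewhere). Assumption: the first `noise_frames`
    frames contain only background noise.
    """
    energies = [short_term_energy(f) for f in frames]
    noise_mfcc = mean_vector(mfccs[:noise_frames])
    noise_energy = sum(energies[:noise_frames]) / noise_frames

    flags = []
    for mfcc, energy in zip(mfccs, energies):
        mfcc_dist = euclidean(mfcc, noise_mfcc)
        energy_dist = abs(energy - noise_energy)
        # Assumed fusion rule: both distances must exceed their thresholds.
        flags.append(mfcc_dist > mfcc_threshold
                     and energy_dist > energy_threshold)
    return flags

def flags_to_segments(flags):
    """Convert per-frame flags to (start, end) frame indices, end exclusive."""
    segments, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(flags)))
    return segments
```

In practice the thresholds would be tuned on labelled recordings, and a different fusion rule (for example a logical OR) may suit low-SNR conditions better.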
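Two steps of contribution (2) can also be illustrated in a hedged way: selecting the main IMFs by their correlation coefficient with the original signal, and classifying with one GMM per event class. The sketch below assumes the IMFs come from an EMD routine run elsewhere, that the feature vectors (for example GFCCs) are already extracted, and that the GMM parameters are already trained with diagonal covariances. The threshold `min_corr` and all names are hypothetical, not taken from the thesis.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def select_imfs(signal, imfs, min_corr=0.3):
    """Keep IMFs whose |correlation| with the signal is at least min_corr.

    min_corr is an assumed, illustrative threshold.
    """
    return [imf for imf in imfs if abs(pearson(signal, imf)) >= min_corr]

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of vector x under a diagonal-covariance GMM.

    gmm: list of (weight, means, variances) tuples, one per component.
    """
    logs = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(log_p)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def classify(features, class_gmms):
    """Pick the class whose GMM gives the highest total log-likelihood."""
    scores = {label: sum(gmm_log_likelihood(x, gmm) for x in features)
              for label, gmm in class_gmms.items()}
    return max(scores, key=scores.get)
```

Keeping one model per class means a new event type can be added by training one more GMM without retraining the others, which matches the abstract's stated reason for choosing this classifier.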
Keywords/Search Tags: acoustic event detection, acoustic event classification, endpoint detection, feature extraction, Gammatone filter, empirical mode decomposition