
Sound Recognition Based On Energy Detection For Complex Environments

Posted on: 2015-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: X X Zhang
Full Text: PDF
GTID: 2308330461471474
Subject: Computer system architecture
Abstract/Summary:
With the continued development of modern society and the economy, more and more attention is being paid to the ecological environment. Environmental sound, which carries a large amount of information, is an indispensable element of that environment. The analysis and recognition of environmental sounds is therefore potentially of great significance for the protection and sustainable development of the ecological environment. In a real ecological environment, noise is everywhere and unavoidable. This paper takes bird sound recognition under complex background noise as its starting point and proposes an environmental sound recognition approach that extracts the Mel-scaled Wavelet packet decomposition Sub-band Cepstral Coefficient (MWSCC) feature after adaptive energy detection (AED) and classifies it with a double-layer mixed classification model consisting of a Gaussian Mixture Model (GMM) and a Support Vector Machine (SVM). The approach is then generalized to environmental sound recognition under complex background noise. The main research work of this paper covers the following three aspects:

1) Adaptive energy detection: Energy detection, as commonly used in the signal detection field, has two difficulties: it requires prior knowledge of the noise variance, and it relies on a fixed detection threshold. To address both, an adaptive energy detection method is proposed. The noisy sound is first divided into several sub-bands according to the distribution characteristics of its frequencies, and the non-stationary noise power spectrum is estimated in each sub-band. Then the presence probability of the foreground sound obtained during noise estimation is used to set the probability of detection, from which the target energy detection threshold is derived. Finally, based on the estimated noise variance and the adaptive detection threshold, an adaptive decision rule for energy detection is constructed to detect the sound of interest.

2) Improved feature extraction: The Mel-frequency cepstrum coefficient (MFCC) performs poorly under complex background noise, and environmental sounds are varied, non-stationary and unstructured. To address these shortcomings, the Mel-scaled Wavelet packet decomposition Sub-band Cepstral Coefficient (MWSCC) is proposed. The improved feature is combined with front-end adaptive energy detection (AED) to obtain AED_MWSCC, meaning that the MWSCC feature is extracted only from the useful sound identified by AED. This not only improves recognition performance but also reduces time complexity.

3) Double-layer mixed classification model: Because GMM is well suited to processing continuous sound signals and SVM is well suited to classifying sounds, a double-layer mixed classification model of GMM and SVM is proposed. The first-layer GMM describes the distribution characteristics of the features and performs coarse classification. The probability output of the first-layer GMM is then used as the input to the second-layer SVM, which performs fine classification of the environmental sound. In this way, the double-layer mixed classification model GMM_SVM is constructed to classify and recognize complex environmental sounds.

In this paper, thirty kinds of complex environmental sounds in three categories (birds, mammals and insects) are used for the research and the comparative experiments. The experimental results show that the proposed AED_MWSCC feature combined with the double-layer GMM_SVM model has good noise robustness and classification performance, and is suitable for environmental sound recognition under complex background noise.
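The adaptive energy detection idea in aspect 1) can be illustrated with a minimal sketch. It is not the method from the thesis: it works on a single full band rather than on sub-bands, it assumes the first few frames contain noise only, and it swaps in the textbook constant-false-alarm threshold (a Gaussian approximation of the frame energy under noise) in place of the threshold the thesis derives from the probability of detection. The function names, the parameters `pfa`, `alpha` and `init_frames`, and the synthetic test signal are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def frame_energies(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def adaptive_energy_detect(signal, frame_len=256, pfa=0.01, alpha=0.9, init_frames=5):
    """Return one boolean per frame: True where foreground sound is detected.

    Assumptions (not from the thesis): one full band instead of sub-bands,
    the first `init_frames` frames are noise only, and the threshold is set
    from a target false-alarm probability `pfa`.
    """
    energies = frame_energies(signal, frame_len)
    if len(energies) <= init_frames:
        raise ValueError("signal too short for the requested init_frames")

    # Initial noise variance estimate from the leading frames.
    noise_var = sum(energies[:init_frames]) / init_frames

    # Under noise only, the frame energy is approximately Gaussian with
    # mean noise_var and standard deviation noise_var * sqrt(2 / frame_len).
    scale = 1.0 + NormalDist().inv_cdf(1.0 - pfa) * math.sqrt(2.0 / frame_len)

    decisions = []
    for i, energy in enumerate(energies):
        threshold = noise_var * scale  # follows the current noise estimate
        detected = i >= init_frames and energy > threshold
        decisions.append(detected)
        if not detected:
            # Update the noise estimate only on frames judged noise-only,
            # so that slowly varying noise is tracked.
            noise_var = alpha * noise_var + (1.0 - alpha) * energy
    return decisions


def synthetic_signal(frame_len=256, n_frames=40, burst=(20, 30), seed=0):
    """Hypothetical input: Gaussian noise with a sine burst in frames [burst[0], burst[1])."""
    rng = random.Random(seed)
    sig = [rng.gauss(0.0, 0.1) for _ in range(frame_len * n_frames)]
    for n in range(burst[0] * frame_len, burst[1] * frame_len):
        sig[n] += math.sin(2.0 * math.pi * 0.05 * n)
    return sig
```

Calling `adaptive_energy_detect(synthetic_signal(), pfa=1e-4)` flags the frames that contain the sine burst. Because the threshold is a fixed multiple of the running noise estimate, it rises and falls with the noise level instead of staying at a preset value.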
Keywords/Search Tags: adaptive energy detection, non-stationary noise power spectrum estimation, Mel-scaled Wavelet packet decomposition Sub-band Cepstral Coefficient (MWSCC), Gaussian Mixture Model (GMM), Support Vector Machine (SVM), double-layer classification model