
Sound Event Recognition Based On Frequency Band Decomposition

Posted on: 2019-08-15    Degree: Master    Type: Thesis
Country: China    Candidate: H K Huang    Full Text: PDF
GTID: 2428330575950870    Subject: Information security
Abstract/Summary:
Sound event recognition is of great significance in fields such as audio data retrieval, acoustic monitoring, medical care, environmental sound recognition, audio forensics, and abnormal event monitoring. However, real environments contain varied background noises that degrade recognition performance. To address this issue, this thesis takes sound event recognition under low signal-to-noise ratio (SNR) as its focus and proposes a recognition method that combines a Bark-scale wavelet packet decomposition coefficient reconstructed spectral projection (BSP) feature and a convolutional neural network (CNN) textural feature with a random forest (RF) classifier. The main work is as follows (illustrative sketches of the individual steps follow this abstract):

1) Spectrogram analysis of sound events. The time-domain signal is converted into a time-frequency spectrogram by the short-time Fourier transform (STFT) for analysis. The sound signal is also transformed into a gammatone spectrogram by a gammatone filter bank; the gammatone spectrogram emphasizes the low-frequency region and relatively de-emphasizes the high-frequency region, which better reflects the frequency distribution of sound events.

2) Extraction of the Bark-scale wavelet packet decomposition coefficient reconstructed spectral projection (BSP) feature. First, the sound event is enhanced using short-time spectral estimation. Then, following the auditory perception of the human ear, Bark-scale wavelet packet decomposition is applied to the sound event, and the decomposition coefficients are used to reconstruct the signal and generate a spectrogram. Finally, projection coefficients are extracted from the spectrogram as the feature, i.e., the BSP feature of each Bark-frequency group.

3) Extraction of spectral texture features with convolutional neural networks. The gammatone spectrogram and the autonomous learning ability of a CNN are used to extract sound-event features. First, the gammatone spectrogram of the sound event is scanned with a sliding window to obtain a set of fragment samples. Then, the CNN is trained on these fragment samples, and the optimal weight parameters are obtained by forward propagation and back-propagation of the error. Finally, the output of the fully connected layer is taken as the feature of the sound event.

4) Classification and recognition. Owing to their fast training and strong generalization ability, random forests are used to classify the extracted features. Multiple feature subsets are generated from the training feature set by the bootstrap method. Then, following the construction rules of decision trees, each feature subset is used to build a decision tree, and the trees together form the random forest. Finally, the prediction for a test sound event is obtained by voting among the decision trees.

The experiments study forty sub-classes of animal sound events; each sub-class is mixed with various noises at various SNRs to simulate real conditions. The experimental results show that the proposed method recognizes most of the sound events well and is suitable for sound event recognition in low-SNR environments.
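For step 1), the following is a minimal sketch of the spectrogram analysis, not the thesis implementation: it reads a mono clip (the file name `clip.wav` and the 25 ms / 10 ms framing are illustrative assumptions), computes a log-magnitude STFT spectrogram with SciPy, and builds a gammatone spectrogram with the third-party `gammatone` package (its `gtgram` helper is assumed to be available).

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

fs, x = wavfile.read("clip.wav")                 # hypothetical mono input file
x = x.astype(np.float64) / (np.max(np.abs(x)) + 1e-12)

# STFT spectrogram: 25 ms window, 10 ms hop (illustrative framing)
f, t, Z = stft(x, fs=fs, nperseg=int(0.025 * fs), noverlap=int(0.015 * fs))
log_spec = 20 * np.log10(np.abs(Z) + 1e-10)      # log-magnitude spectrogram

# Gammatone spectrogram (ERB-spaced bands emphasize low frequencies)
from gammatone.gtgram import gtgram
gt = gtgram(x, fs, window_time=0.025, hop_time=0.010, channels=64, f_min=50)
log_gt = 20 * np.log10(gt + 1e-10)
```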
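For step 2), the sketch below illustrates the BSP idea with PyWavelets. The abstract does not specify the exact Bark-band-to-node mapping, the wavelet, or the projection definition, so the 24-group leaf grouping, the `db4` wavelet, and the time-axis projection used here are assumptions; the short-time spectral-estimation enhancement step is omitted.

```python
import numpy as np
import pywt
from scipy.signal import stft

def bsp_feature(x, fs, wavelet="db4", level=6):
    """Reconstruct Bark-like frequency groups from wavelet packet
    coefficients and project each group's spectrogram over time
    (assumed projection)."""
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, mode="symmetric",
                            maxlevel=level)
    nodes = wp.get_level(level, order="freq")        # leaves in frequency order
    # Illustrative grouping: merge the leaves into ~24 Bark-like bands
    groups = np.array_split(np.arange(len(nodes)), 24)

    feature = []
    for grp in groups:
        # Rebuild the signal from this group's coefficients only
        band_wp = pywt.WaveletPacket(data=None, wavelet=wavelet,
                                     mode="symmetric", maxlevel=level)
        for i in grp:
            band_wp[nodes[i].path] = nodes[i].data
        band_sig = band_wp.reconstruct(update=False)[: len(x)]

        # Spectrogram of the reconstructed band, projected onto the frequency axis
        _, _, Z = stft(band_sig, fs=fs, nperseg=512)
        feature.append(np.sum(np.abs(Z), axis=1))
    return np.concatenate(feature)
```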
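For steps 3) and 4), the sketch below shows how a small CNN can supply texture features to a random forest. The network architecture, the 64x64 fragment size, the 128-dimensional feature layer, and all hyperparameters are illustrative assumptions, not the thesis configuration; the CNN training loop is omitted.

```python
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

class TextureCNN(nn.Module):
    """Toy CNN whose fully connected layer output serves as the texture feature."""
    def __init__(self, n_classes):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.Linear(32 * 16 * 16, 128)   # feature layer
        self.out = nn.Linear(128, n_classes)     # used only during CNN training

    def forward(self, x):                         # x: (N, 1, 64, 64)
        h = self.conv(x).flatten(1)
        feat = torch.relu(self.fc(h))             # texture feature vector
        return self.out(feat), feat

# After training the CNN on gammatone-spectrogram fragments (loop omitted),
# the feature layer output feeds a random forest:
#   model = TextureCNN(n_classes=40)
#   _, feats = model(train_fragments)            # hypothetical tensors
#   rf = RandomForestClassifier(n_estimators=200)
#   rf.fit(feats.detach().numpy(), train_labels)
#   pred = rf.predict(test_feats)                # majority vote of the trees
```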
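For the experimental setup, the following small sketch shows one common way a clean event can be mixed with noise at a target SNR; the 0 dB target in the usage comment is illustrative, and the thesis may use a different mixing procedure.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean/noise power ratio equals snr_db, then mix."""
    noise = noise[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

# Example (hypothetical signals): noisy = mix_at_snr(event_signal, babble_noise, snr_db=0)
```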
Keywords/Search Tags:sound event recognition, spectrogram analysis, Bark wavelet packet decomposition, convolutional neural networks, random forests