
Environment Sound Recognition Based On Deep Learning Methods

Posted on: 2019-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: J H Li
Full Text: PDF
GTID: 2428330575950158
Subject: Software engineering
Abstract/Summary:
Environmental sound recognition collects and identifies audio data from the surroundings in order to achieve awareness of the environment. It plays an important role in audio forensics, location tracking, sound event detection, and scene recognition. In this thesis, we propose a sound enhancement algorithm based on the Stack Denoise Auto-encoder (SDA) to address the problem that animal sound event recognition in natural environments is disturbed by non-stationary noise, and we propose a method combining Convolutional Neural Networks (CNN) and Random Forest (RF) for acoustic scene recognition. The main work is as follows:

(1) Sound enhancement algorithm based on SDA. In natural environments, sound event recognition is disturbed by non-stationary noise, and traditional methods cannot effectively filter out the noise feature components. We propose an algorithm that combines data augmentation, which artificially mixes clean sound and noise at different signal-to-noise ratios (SNR), with an SDA model that has a multi-layer convolutional structure, and we use the combination to filter out the noise feature components. First, the clean sound and noise are mixed at different SNRs, and the mixed signal is used to create a gammatone spectrogram. Second, this spectrogram is used as the input of the SDA, which is trained in a supervised manner with the gammatone spectrogram of the clean sound signal as its target output.

(2) Feature extraction of acoustic scenes. We propose a method that uses the Mel energy spectrum together with the CNN's ability to learn high-dimensional spectral features. After the CNN is trained, the output of its fully connected layer is extracted as the CNN feature, which serves as an intermediate feature for other classifiers. First, the short-time Fourier transform of the acoustic scene recording is used to generate the short-term power spectrum, and Mel filter banks are applied to obtain the Mel power spectrum. Second, a new dataset is created from slices of the Mel energy spectrum by shifting a time window. Third, the CNN model is trained on the new dataset in two stages, and the output of the fully connected layer is extracted as the CNN feature, which is the feature representation of the acoustic scene.

(3) Classification and recognition of acoustic scenes. Softmax is the classifier in the traditional CNN model structure; however, it is not robust to noise and is prone to overfitting. To address these problems, we propose using an RF classifier to classify the output of the CNN's fully connected layer. First, the CNN loads the pre-trained weights, and the output of its fully connected layer is extracted as the training set for the RF. Second, the decision trees are built from the CNN features to form the RF. In the test stage, the voting result of the RF is taken as the method's prediction.

In summary, this thesis proposes an SDA-based sound enhancement algorithm, applied to animal sound recognition in natural environments, and the Mel-CNN-RF recognition method, applied to scene recognition on DCASE2016. The experimental results and analysis show that the deep learning methods used here are effective for environmental sound recognition.
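
Two of the preprocessing steps named in the abstract, mixing clean sound with noise at a chosen SNR and cutting a spectrogram into segments by shifting a time window, can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function names `mix_at_snr` and `slice_spectrogram`, the parameters `width` and `hop`, and the power-based SNR definition are assumptions made for illustration, and the sketch does not compute the gammatone or Mel spectrograms themselves.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the given SNR in dB.

    Hypothetical helper; assumes both signals have the same length and
    that SNR is defined from mean signal power.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if clean.shape != noise.shape:
        raise ValueError("clean and noise must have the same shape")
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_clean / (gain^2 * P_noise)), solved for gain.
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def slice_spectrogram(spec, width, hop):
    """Cut a (n_bands, n_frames) spectrogram into fixed-width segments.

    Hypothetical helper; the window moves `hop` frames at a time and
    trailing frames that do not fill a whole window are dropped.
    """
    spec = np.asarray(spec)
    n_frames = spec.shape[1]
    if width > n_frames:
        raise ValueError("window is wider than the spectrogram")
    starts = range(0, n_frames - width + 1, hop)
    # Result shape: (n_segments, n_bands, width).
    return np.stack([spec[:, s:s + width] for s in starts])
```

Calling `mix_at_snr` repeatedly with different `snr_db` values gives several noisy versions of each clean recording, which is the kind of augmentation the abstract describes; `slice_spectrogram` turns one long spectrogram into many equally sized training examples.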
Keywords/Search Tags: Sound enhancement algorithm, Stack Denoise Auto-encoder, acoustic scene recognition, Convolutional Neural Networks, Random Forest