Research On Environmental Sound Recognition Method Based On Deep Learning

Posted on: 2021-10-05  Degree: Master  Type: Thesis
Country: China  Candidate: M Yao  Full Text: PDF
GTID: 2518306473964579  Subject: Master of Engineering
Abstract/Summary:
Environmental sound recognition is a research direction within acoustic signal processing. It identifies environmental sounds by analyzing the components of the acoustic signal. The technology can be applied in fields such as security monitoring, medical care, and ecological protection, where it compensates for the low monitoring efficiency caused by the limitations of video surveillance, such as restricted viewing angle and dependence on lighting. However, environmental sound is non-stationary and is affected by complex background noise, which makes it difficult to recognize. This thesis studies environmental sound recognition against complex acoustic backgrounds. The main work and contributions are summarized as follows.

Because a single acoustic feature cannot effectively characterize complex environmental sound, three features (Mel-frequency cepstral coefficients, the logarithmic Mel spectrum, and short-term energy) are fused to improve the representation of environmental sounds. Comparative experiments show that the choice of fusion method strongly affects the recognition results. The post-fusion method, in which a three-input convolutional neural network extracts features from each of the three inputs separately, achieves better recognition than the pre-fusion method. The main reason is that the three-input network configures separate convolution and pooling operations for each input feature, which favors extraction of each feature's own characteristics and avoids applying the same weight parameters to different features.

The acoustic features commonly used in environmental sound recognition algorithms were originally designed for speech and music recognition. They fit speech and music well but may not fully express some information specific to environmental sound. To address this, a dual-input convolutional neural network is designed that takes the logarithmic Mel spectrum and the raw audio signal as inputs. Features extracted directly from the raw audio signal complement the logarithmic Mel spectrum features, which improves the ability of the fused features to represent environmental sound. Experimental results show that this method outperforms the method that combines multiple acoustic features.

The author participated in the development of an environmental sound monitoring system that integrates audio signal collection, data processing and recognition, and environmental sound map display. The system consists of three parts: an audio collection unit, a data processing unit, and an environmental sound map display unit. The data processing unit comprises a data storage unit and an audio recognition unit; this thesis focuses on the audio recognition unit. To make the system better suited to real scenes, the audio collection unit was used to collect the required environmental sounds, a custom environmental sound data set was built, and the network models were then trained. Field tests of the monitoring system show that it is highly robust and can be applied to practical environmental sound monitoring.
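
As a rough illustration of the three acoustic features named in the abstract (Mel-frequency cepstral coefficients, logarithmic Mel spectrum, and short-term energy), the following NumPy-only sketch computes them per frame and then stacks them along the feature axis, which is one simple reading of "pre-fusion". The abstract does not give the thesis's settings, so the sample rate, frame length, hop size, number of Mel bands, number of cepstral coefficients, and all function names here are assumptions chosen for illustration. The post-fusion variant would instead pass each feature to its own convolutional branch and is not shown.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames, shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filterbank, shape (n_mels, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    # n_mels + 2 edge points, evenly spaced on the Mel scale
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        lo, ctr, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct2(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def extract_features(x, sr=22050, frame_len=1024, hop=512, n_mels=40, n_mfcc=13):
    """Return (mfcc, log_mel, energy), each with one row per frame."""
    frames = frame_signal(x, frame_len, hop)
    # Short-term energy: sum of squared samples in each frame
    energy = np.sum(frames ** 2, axis=1, keepdims=True)
    # Power spectrum of Hann-windowed frames
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    # Logarithmic Mel spectrum (small constant avoids log(0))
    log_mel = np.log(power @ mel_filterbank(sr, frame_len, n_mels).T + 1e-10)
    # MFCC: DCT of the log-Mel spectrum
    mfcc = dct2(log_mel, n_mfcc)
    return mfcc, log_mel, energy

def pre_fuse(mfcc, log_mel, energy):
    """Pre-fusion sketch: concatenate the per-frame features into one matrix."""
    return np.concatenate([mfcc, log_mel, np.log(energy + 1e-10)], axis=1)
```

With these assumed defaults, a one-second clip at 22050 Hz yields 42 frames and a fused matrix with 13 + 40 + 1 = 54 columns per frame. Feeding that single matrix to one network means every feature shares the same convolution weights, which is the limitation the abstract attributes to pre-fusion.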
Keywords/Search Tags: environmental sound recognition, feature fusion, convolutional neural network, environmental sound monitoring system