Research On Urban Environmental Sound Recognition Based On Deep Convolutional Neural Network

Posted on: 2022-05-31 | Degree: Master | Type: Thesis
Country: China | Candidate: T W Lu | Full Text: PDF
GTID: 2512306722490824 | Subject: Electronic Science and Technology
Abstract/Summary:
Sound, as an important medium of information dissemination, is a key way for humans to perceive their surroundings. With the rapid development of urban informatization, urban environmental sound recognition has gradually become a hot topic in sound signal processing. Environmental sound recognition (ESR) technology can be applied in many scenarios, such as security monitoring, ecosystem surveys, and smart cities. In recent years, with the development of deep learning, algorithms based on deep neural networks have made remarkable progress on the ESR task. However, the lack of sufficiently large data sets limits the performance of deep learning models on this task. To alleviate this problem, this dissertation focuses on deep convolutional neural networks (DCNNs).

First, a residual network (ResNet50) pre-trained on large image data sets is selected as the baseline system for the ESR task. The experimental results show that ResNet50 achieves better recognition performance than the classical AlexNet and VGG16 models. On this basis, a recognition model based on a capsule network is constructed: the pre-trained ResNet50 serves as the feature extractor, and the capsule network models the sound features from multiple angles. The results show that the transferred features reduce the number of hidden layers that need to be trained in the environmental sound recognition network and alleviate the over-fitting caused by the small scale of the data sets. The capsule network improves both the recognition performance and the convergence speed of the model.

In addition, because large labeled data sets are lacking in the ESR task, the basic DCNN model is improved: the kernel size of the first convolutional layer is changed to 1×1, and some hyperparameters of the model are selected adaptively by Bayesian optimization. Experimental results show that using multiple 1×1 convolution kernels effectively increases the dimension of the input feature map, and that Bayesian optimization allows the model to select appropriate parameter values adaptively, thereby improving its recognition performance.

Finally, for the application scenario of automobile whistle (horn) recognition on traffic roads, this dissertation constructs an improved automobile whistle recognition system based on the improved algorithm described above. Sub-band spectral entropy parameters are first used to pre-select the segments that are likely to contain automobile horns, and these segments are then passed to the recognizer. The system is verified on a recorded road sound data set, and the results show that the recognition accuracy is clearly improved, reaching 98.5%.
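The sub-band spectral entropy pre-selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: the entropy is taken as the Shannon entropy of the normalized energies of equal-width frequency bands, the band count (8), frame length (64 samples), and threshold (1.0 bit) are arbitrary, and the function names `power_spectrum`, `subband_spectral_entropy`, and `preselect_frames` are hypothetical. A naive DFT is used only to keep the example free of dependencies; a real system would use an FFT routine.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of a real frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def subband_spectral_entropy(frame, num_bands=8):
    """Shannon entropy (bits) of the normalized sub-band energies.

    Assumed definition: equal-width bands over the non-DC bins.
    """
    spectrum = power_spectrum(frame)[1:]  # drop the DC bin
    band_size = len(spectrum) // num_bands
    if band_size == 0:
        raise ValueError("frame too short for the requested number of bands")
    energies = [sum(spectrum[b * band_size:(b + 1) * band_size])
                for b in range(num_bands)]
    total = sum(energies)
    if total <= 0.0:
        # Design choice for this sketch: treat silence as maximally flat
        # so that it is never pre-selected as a horn candidate.
        return math.log2(num_bands)
    probs = [e / total for e in energies]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def preselect_frames(frames, threshold, num_bands=8):
    """Indices of frames whose entropy is below the (assumed) threshold."""
    return [i for i, frame in enumerate(frames)
            if subband_spectral_entropy(frame, num_bands) < threshold]


if __name__ == "__main__":
    n = 64
    tone = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]  # horn-like
    impulse = [1.0] + [0.0] * (n - 1)                             # broadband
    print(subband_spectral_entropy(tone))
    print(subband_spectral_entropy(impulse))
    print(preselect_frames([tone, impulse], threshold=1.0))
```

The intuition behind this kind of pre-selection is that a tonal sound such as a horn concentrates its energy in a few bands and therefore has low entropy, while broadband noise spreads energy across all bands and has high entropy. Only the low-entropy frames would then be handed to the more expensive DCNN recognizer.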
Keywords/Search Tags:Urban environmental sound recognition, Transfer learning, DCNN, Whistle detection, Sub-band entropy