Research On Urban Sound Classification Based On Deep Learning

Posted on: 2022-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Huang
GTID: 2518306527984069
Subject: Mechanical engineering
Abstract/Summary:
Audio carries a large amount of information about the daily environment, life scenes, and physical events in a city. City designers therefore develop deep learning approaches to extract this information automatically and put it to use, which has great potential in smart-city construction for applications such as noise monitoring, urban security, multimedia retrieval, and smart factories. This thesis studies urban audio classification, targeting the insufficient classification accuracy, generalization ability, and noise robustness of current models. It is organized as follows.

First, to address the insufficient classification accuracy and generalization ability of existing models, a novel urban audio classification model based on an N-order dense convolutional network (abbreviated as N-DenseNet) is proposed. The network structure of DenseNet is briefly introduced, and its dense connection is then improved with an N-order state-dependent connection based on the N-order Markov model. Combining the advantages of DenseNet and the N-order Markov model yields a novel network architecture, N-DenseNet. Theoretically, N-DenseNet satisfies the premise of alleviating vanishing gradients, so it can both integrate feature information from the layers more efficiently and accelerate convergence. Experimental results on the UrbanSound8K and DCASE2016 datasets show that N-DenseNet reaches accuracies of 83.27% and 81.03%, respectively, demonstrating higher classification accuracy and better generalization performance.

Second, to further improve the classification accuracy and generalization ability of current models, an urban sound classification model based on a 2-order dense convolutional network using dual features (abbreviated as D-2-DenseNet) is proposed. Theoretically, D-2-DenseNet not only converges faster than DenseNet but also improves classification accuracy while maintaining good generalization ability, because it exploits dual-feature fusion. To validate these advantages, the model is applied to urban sound event classification on the UrbanSound8K and DCASE2016 datasets. The experimental results show accuracies of 84.83% and 85.17%, respectively, which are 13.81% and 7.07% higher than the baseline and demonstrate the classification accuracy and generalization performance of D-2-DenseNet.

Third, to address the insufficient noise robustness of current models, a noise-robust urban audio classification model based on a 2-order dense convolutional network with adaptive dual-feature compensation (abbreviated as DA-2-DenseNet) is proposed. DA-2-DenseNet combines the advantages of dual-feature mutual compensation, 2-DenseNet, and an adaptive mechanism, so the adaptive dual-feature mutual compensation algorithm can effectively extract audio information and reduce noise interference, achieving better noise robustness. DA-2-DenseNet is applied to urban sound event classification on the DCASE2016 dataset. Under channel noise and environmental noise, the experimental results show accuracies of 77.12% and 75.52%, respectively, which are 8.51% and 10.38% higher than the baseline and verify the noise robustness of DA-2-DenseNet.

Finally, to evaluate the feasibility and practicality of the model in real scenes, AI EAR, an intelligent classification, recognition, and detection system for urban audio, is built on the deep learning model. APIs are developed for audio classification, sound detection, real-time acoustic scene recognition, and audio/video retrieval, and a Qt-based GUI is developed to provide human-machine interaction for these functions. An audio classification experiment is then carried out in Wuxi. The experimental results show that the AI EAR system processes audio with excellent real-time capability, accuracy, and efficiency, which verifies the feasibility and practicality of the model and system.
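
The abstract does not spell out how the N-order state-dependent connection is wired, so the following is only a minimal sketch under one assumed reading: in a DenseNet block every layer receives all earlier feature maps, while in an N-order block each layer receives only the N most recent ones, by analogy with an N-order Markov model. The function names and the `in_channels` and `growth_rate` parameters are hypothetical and do not come from the thesis.

```python
from typing import List


def n_order_inputs(layer: int, order: int) -> List[int]:
    """Indices of the feature maps fed into `layer` under the assumed N-order rule.

    Index 0 is the block input; index k (k >= 1) is the output of layer k.
    Assumption: a layer sees only the `order` most recent feature maps.
    """
    if layer < 1 or order < 1:
        raise ValueError("layer and order must be >= 1")
    start = max(0, layer - order)
    return list(range(start, layer))


def dense_inputs(layer: int) -> List[int]:
    """Standard DenseNet wiring: a layer sees every earlier feature map."""
    if layer < 1:
        raise ValueError("layer must be >= 1")
    return list(range(0, layer))


def input_channels(layer: int, order: int, in_channels: int, growth_rate: int) -> int:
    """Channel count after concatenating the inputs of `layer`.

    The block input contributes `in_channels`; every layer output
    contributes `growth_rate` channels (hypothetical hyperparameters).
    """
    total = 0
    for src in n_order_inputs(layer, order):
        total += in_channels if src == 0 else growth_rate
    return total


if __name__ == "__main__":
    for layer in range(1, 6):
        print(layer, dense_inputs(layer), n_order_inputs(layer, order=2))
```

Under this reading, the concatenated input of a 2-order layer stays at a fixed width once the block is deeper than two layers, whereas the dense input keeps growing with depth. Whether the thesis uses exactly this wiring cannot be confirmed from the abstract alone.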
Keywords/Search Tags: Urban sound classification, N-DenseNet, D-2-DenseNet, DA-2-DenseNet, AI EAR