
Content Analysis For Natural Acoustic Scene Based On Deep Neural Network

Posted on: 2022-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: M L Liu
Full Text: PDF
GTID: 2518306569972709
Subject: Communication and Information System
Abstract/Summary:
In today's era of mobile Internet and big data, people can easily obtain massive amounts of audio data recorded in natural acoustic scenes. How to analyze and browse these data effectively has become a research hotspot in natural acoustic scene analysis. This thesis takes audio recorded in natural acoustic scenes as its processing object, investigates the problems of acoustic scene clustering and acoustic event detection, and proposes solutions to both. The main work and contributions of the thesis are as follows.

At present, most researchers have focused on supervised acoustic scene classification, while unsupervised acoustic scene clustering has rarely been studied. This thesis proposes an acoustic scene clustering method based on the joint optimization of deep feature learning and clustering iteration. First, log Mel spectrum (LMS) features are extracted from the audio samples, and a convolutional neural network (CNN) is initialized for extracting deep features. Then, the agglomerative hierarchical clustering (AHC) algorithm is used to merge the two most similar clusters, and the CNN parameters are updated according to the loss function designed in this thesis. Feature extraction and clustering iteration are carried out alternately until the convergence condition is satisfied. Two mainstream audio datasets are used for evaluation, with normalized mutual information (NMI) and clustering accuracy (CA) as performance metrics. The experimental results show that the proposed method outperforms traditional clustering algorithms, that the proposed deep feature outperforms the other features compared, and that the proposed method is robust.

How to improve the performance of acoustic event detection without increasing the complexity of the deep neural network is another open issue in natural acoustic scene analysis, and it is the second topic of this thesis. The thesis proposes an acoustic event detection method based on a dilated convolutional recurrent neural network (DCRNN). First, the LMS features of each audio sample are extracted, and the DCRNN is constructed. The constructed DCRNN is then used to determine the acoustic event type of each audio frame of the test sample. Three experimental datasets (Synthetic 2016, TUT Sound Event 2016 and TUT Sound Event 2017) are used for evaluation, with F1 score and error rate as performance metrics. The experimental results show that, compared with the baseline method, the proposed method achieves better detection performance on all three datasets without increasing the number of network parameters.

In conclusion, this thesis proposes methods for acoustic scene clustering and acoustic event detection based on deep neural networks, and evaluates and analyzes them experimentally from multiple perspectives to demonstrate their effectiveness.
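The abstract states that the dilated-convolution approach improves detection without increasing the parameter count. The sketch below illustrates only the general property that makes this plausible: a dilated convolution reuses the same number of kernel weights while covering a wider span of input frames. It is not the thesis's DCRNN; the function names, the kernel size of 3, and the dilation rates (1, 2, 4) are assumptions chosen for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated convolution (cross-correlation form, as in CNN layers)."""
    # Number of input samples spanned by one output sample.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of stride-1 convolutions with the given dilations."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Hypothetical toy input: ten "frames" with values 0..9 and a 3-tap kernel.
signal = [float(i) for i in range(10)]
kernel = [1.0, 1.0, 1.0]

plain = dilated_conv1d(signal, kernel, dilation=1)    # taps at offsets 0, 1, 2
dilated = dilated_conv1d(signal, kernel, dilation=2)  # taps at offsets 0, 2, 4

# Both layers use len(kernel) == 3 weights; only the spacing of the taps differs.
standard_stack = receptive_field(3, [1, 1, 1])  # three ordinary layers
dilated_stack = receptive_field(3, [1, 2, 4])   # three dilated layers, same weights
```

With these assumed settings, three stacked 3-tap layers see 7 input frames without dilation and 15 frames with dilation rates 1, 2 and 4, at an identical weight count per layer.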
Keywords/Search Tags: Deep feature, Deep neural network, Dilated convolution, Acoustic scene clustering, Sound event detection