Acoustic Scene Classification Via Classifiers Voting

Posted on:2021-02-16

Degree:Master

Type:Thesis

Country:China

Candidate:Shawn Sudheer Sagar

GTID:2428330611466323

Subject:Information and Communication Engineering

Abstract/Summary:

Acoustic Scene Classification (ASC) is the task of recognizing and categorizing an audio recording according to the environment in which it was recorded. ASC is a challenging machine-listening application because the audio signals are noisy. We analyze several state-of-the-art models for ASC on two datasets from the IEEE challenge on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 and 2019. Both datasets are publicly available for research purposes and consist of 10 and 15 classes of acoustic scenes, respectively. In total, 57 hours of stereo recordings are available, covering common indoor and outdoor environmental scenes such as beach, city center, library, forest path, train, and car. We propose a method of ASC that fuses the votes of deep neural networks. In the proposed method, two acoustic features are first extracted from each audio recording: Mel-frequency cepstral coefficients (MFCC) and logarithmic filter-bank (LFB) features. These features are then fed into three deep neural network classifiers: Visual Geometry Group (VGG), Residual Network (ResNet), and Long Short-Term Memory (LSTM). This variety of networks was chosen because they have complementary advantages for ASC. After each network is trained and its results are obtained, the classifiers' votes are fused to determine the final outcome. Fusing results through voting is an ensemble learning method. The final classification accuracies (CA) obtained after fusing the classifiers are 73.27% and 76.99% on the DCASE 2017 and 2019 datasets, respectively. Compared with the individual baseline classifiers, the proposed fusion of classifier votes improves CA by 12.71% and 13.79% on the DCASE 2017 and 2019 datasets, respectively.
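
The voting step described in the abstract can be illustrated with a minimal sketch of hard (majority) voting over per-classifier scene labels. This is not the thesis implementation: the abstract does not say whether votes are weighted or how ties are resolved, so the function names (`majority_vote`, `fuse`, `accuracy`), the tie-breaking rule (the earliest classifier in list order wins), and the toy predictions are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among one recording's predictions.

    Assumption: ties are broken in favor of the earliest classifier in
    the list; the abstract does not specify a tie-breaking rule.
    """
    counts = Counter(labels)
    top = max(counts.values())
    for label in labels:
        if counts[label] == top:
            return label


def fuse(per_classifier_predictions):
    """Fuse predictions from several classifiers, recording by recording.

    per_classifier_predictions maps a (hypothetical) system name to a
    list of predicted labels, one per recording, all in the same order.
    """
    per_recording = zip(*per_classifier_predictions.values())
    return [majority_vote(list(votes)) for votes in per_recording]


def accuracy(predicted, truth):
    """Fraction of recordings whose predicted label matches the truth."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


# Toy predictions for four recordings (made up for illustration only).
predictions = {
    "vgg": ["beach", "car", "forest path", "train"],
    "resnet": ["beach", "train", "library", "train"],
    "lstm": ["car", "car", "library", "car"],
}
truth = ["beach", "car", "library", "train"]

fused = fuse(predictions)
for name, labels in predictions.items():
    print(name, accuracy(labels, truth))
print("fused", accuracy(fused, truth))
```

In this toy case each classifier errs on a different recording, so the fused vote is correct wherever at least two of the three agree on the true label. That is the sense in which classifiers with complementary strengths can exceed their individual accuracies once fused.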

Keywords/Search Tags:

ASC, DCASE, Deep Neural Network, Acoustic Features, MFCC, LFB, VGG, ResNet, LSTM, Classifier Voting

Related items

1	Research Of Action Recognition From Videos Using Deep Neural Networks
2	Acoustic Scene Classification Based On Hybrid Convolutional Neural Network
3	An Improved CNN-ResNet Deep Learning Neural Network And Its Applications
4	Research On Speaker Recognition Based On Deep Neural Network
5	Research On Video Face Verification Algorithms Based On Deep CNN And Local Textural Features
6	Comparative Research On The Recognition Effect Of Wake-Up Words Based On Deep Learning
7	Content Analysis For Natural Acoustic Scene Based On Deep Neural Network
8	Human Action Recognition Method Based On Neural Network And Skeletal Points Features
9	Research On Abnormal Network Traffic Detection Technology Based On ResNet-LSTM Model
10	Non-specific Human Sign Language Recognition Based On Deep Learning