A Study On Acoustic Scene Classification By Ensembling Multiple Deep Models

Posted on:2018-01-14

Degree:Master

Type:Thesis

Country:China

Candidate:F F Peng

GTID:2348330533969798

Subject:Computer technology

Abstract/Summary:

Acoustic scene classification (ASC) is a specific task in the Computational Auditory Scene Analysis (CASA) domain. It uses an audio recording and a set of scene semantic labels to recognize the specific scene in a real environment. Unlike psychological studies devoted to understanding the mechanism of perceiving audio scenes, ASC mainly relies on signal processing techniques and machine learning methods to recognize audio scenes automatically. Traditional ASC work mainly focuses on feature engineering and classifier selection for detecting a single scene. With the rapid development of audio collection devices, a wide variety of audio has been collected, and with such varied audio scene data the traditional methods are increasingly difficult to apply and struggle to achieve better performance. To make full use of this variety of audio scene data, this thesis uses several deep learning methods. First, Mel-Frequency Cepstral Coefficient (MFCC) features or log-mel spectrogram features are extracted. Then the frame features are spliced into segment features. Finally, these segment features are fed into deep learning models such as the Multi-layer Perceptron (MLP), the Convolutional Neural Network (CNN), and the Long Short-Term Memory (LSTM) network. To improve the LSTM-based ASC system, a segment processing technique is proposed in this thesis. The technique can not only model complex temporal relations but also enlarge the training data. To improve the MLP-based ASC system, an attention mechanism is introduced. With it, the network can break through the limitations of a global data representation and pay more attention to the key parts of the data. At the same time, the attention mechanism can deal with the decoupling problem, that is, using different feature spaces to describe different scenes. Different kinds of deep learning methods have different recognition abilities for different scenes. For example, the MLP has an advantage on beaches and residential areas, while the CNN detects libraries and buses more easily. Ensemble learning can usually achieve significantly better generalization performance than the best single learner. To make full use of the different classifiers, Bagging Ensemble Selection (BES) is used and achieves the best performance.
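
The splicing of frame features into segment features mentioned in the abstract can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the thesis's implementation: the function name `splice_frames`, the parameters `seg_len` and `hop`, and the toy input values are all hypothetical, and the abstract does not state the segment length or whether segments overlap.

```python
from typing import List, Sequence


def splice_frames(
    frames: Sequence[Sequence[float]], seg_len: int, hop: int
) -> List[List[float]]:
    """Concatenate `seg_len` consecutive frame vectors into one segment vector.

    `hop` is the number of frames between segment starts (an assumed
    parameter). A hop smaller than `seg_len` produces overlapping segments.
    Trailing frames that do not fill a whole segment are dropped.
    """
    if seg_len <= 0 or hop <= 0:
        raise ValueError("seg_len and hop must be positive")
    segments = []
    for start in range(0, len(frames) - seg_len + 1, hop):
        # Flatten the window of frames into a single feature vector.
        window = frames[start:start + seg_len]
        segments.append([value for frame in window for value in frame])
    return segments


# Toy input: 6 frames with 2 coefficients each
# (stand-ins for MFCC or log-mel values, not real features).
frames = [[float(t), float(t) + 0.5] for t in range(6)]

non_overlapping = splice_frames(frames, seg_len=3, hop=3)
overlapping = splice_frames(frames, seg_len=3, hop=1)

print(len(non_overlapping), len(overlapping))
```

With a hop smaller than the segment length, the same six frames yield four segments instead of two, which is one plausible way a segment-based scheme could enlarge the training data.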

Keywords/Search Tags:

acoustic scene classification, deep learning, attention, ensemble methods

Related items

1	Research On Acoustic Scene Detection Based On Deep Learning
2	Study Of Attention-based Deep Models For Acoustic Scene Classification
3	Research On Attention Based Image Classification With Deep Learning
4	Research On Acoustic Scene Classification Using Deep Learning
5	Research On Acoustic Scene Classification
6	Feature Augmentation And Model Build For Acoustic Scene Classification With Multiple Devices
7	Visual Attention Models Based On Deep Learning For Scene Classification
8	Acoustic Scene Classification Method Based On Convolutional Neural Network
9	Acoustic Scene Classification Using Multi-Scale Deep Feature Aggregation
10	Research On Acoustic Target Recognition Methods Based On Deep Learning