Research On Acoustic Scene Classification Using Deep Learning

Posted on:2022-02-22

Degree:Master

Type:Thesis

Country:China

Candidate:G J Qiao

Full Text:PDF

GTID:2518306476490734

Subject:Communication and Information System

Abstract/Summary:

Acoustic scene classification (ASC) associates an audio recording with the scene in which it was recorded, and is one of the important topics in computational auditory scene analysis. It works mainly by extracting features from the audio signal and classifying those features into the corresponding scenes. Current ASC systems consist mainly of an audio feature extractor and a classifier. The extracted features are mainly Mel Frequency Cepstral Coefficients (MFCC) and Log-Mel spectrograms; the classifiers are mainly recurrent neural networks, convolutional neural networks, and deep neural networks. Researchers improve model performance by improving a single model, by multi-model integration, and by transfer learning. When the quality of video information is poor, using audio analysis to assist a video classification system can contribute to the development of autonomous driving and smart cities.

To address the low accuracy of acoustic scene classification, this thesis focuses on improving single-model performance. The research covers three aspects. First, based on the Log-Mel spectrogram, audio features are extracted by varying the number of filters, using different audio channels, and applying Harmonic Percussive Source Separation (HPSS) as an enhancement method. Second, with a convolutional neural network as the classifier, a Squeeze Excitation (SE) module is added so that the network can attend to information across feature channels, and SE is also applied in a novel way to extract information across different frequencies. Third, based on the classic convolutional neural network structure Visual Geometry Group (VGG) and the basic structural unit of Inception, one Inception structural unit and two VGG basic structural units are combined into a hybrid network that serves as the classifier.

Experiments on the dataset of the 2019 Detection and Classification of Acoustic Scenes and Events (DCASE) challenge show that features extracted with an appropriate number of filters and with HPSS improve the accuracy of acoustic scene classification; the channel-based SE block improves classification performance; the frequency-based SE module improves classification in some scenes; and the hybrid-network model achieves better classification accuracy in some scenes.
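The channel-based and frequency-based squeeze excitation idea described above can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function name squeeze_excite, the feature-map layout (channels, frequency bins, time frames), the reduction ratio, and the random weights are all assumptions made for illustration. The same gate is applied along the channel axis or along the frequency axis by changing one argument.

```python
import numpy as np

def squeeze_excite(x, w1, w2, axis):
    """Gate a feature map along one axis (hypothetical helper, not from the thesis).

    x    : array of shape (channels, freq_bins, frames)
    w1   : (n, n // r) bottleneck weights, n = x.shape[axis]
    w2   : (n // r, n) expansion weights
    axis : 0 gates feature channels, 1 gates frequency bins
    """
    # Squeeze: average over every axis except the gated one -> shape (n,)
    other = tuple(a for a in range(x.ndim) if a != axis)
    s = x.mean(axis=other)
    # Excitation: bottleneck with ReLU, then a sigmoid gate in (0, 1)
    h = np.maximum(s @ w1, 0.0)
    g = 1.0 / (1.0 + np.exp(-(h @ w2)))
    # Scale: broadcast the gate back over the original feature map
    shape = [1] * x.ndim
    shape[axis] = x.shape[axis]
    return x * g.reshape(shape)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 10))   # assumed: 8 channels, 16 mel bins, 10 frames

# Reduction ratio r = 4 (an assumption); weights are random stand-ins for learned ones
wc1, wc2 = rng.standard_normal((8, 2)), rng.standard_normal((2, 8))
wf1, wf2 = rng.standard_normal((16, 4)), rng.standard_normal((4, 16))

y_channel = squeeze_excite(x, wc1, wc2, axis=0)  # channel-based gating
y_freq = squeeze_excite(x, wf1, wf2, axis=1)     # frequency-based gating
```

In a trained network the two weight matrices would be learned together with the convolutional layers; here they only show how the gate is computed and broadcast.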

Keywords/Search Tags:

Acoustic scene classification, Convolutional neural network, Squeeze excitation, Harmonic percussive source separation

Related items

1	Design And Implementation Of A Content-based Music Genre Auto-classification System
2	Research On Acoustic Scene Classification Based On Convolutional Neural Network
3	Acoustic Scene Classification Method Based On Convolutional Neural Network
4	Research On Object Detection Method By Parallel Connecting Deep-shallow Layers With Squeeze-and-excitation
5	Acoustic Scene Classification Based On Hybrid Convolutional Neural Network
6	Research On The Separation Algorithm Of Music Instruments And Singing Voice
7	Research On Chinese News Text Classification Method Based On CNN Mixed Model
8	Research On Feature Extraction And Recognition Of Sound Event
9	Research And Implementation Of Speaker Recognition Based On Deep Learning
10	Research On Acoustic Scene Classification Method Based On Subspectrogram