Research On Classification Of Acoustic Scenes

Posted on: 2023-11-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Zhao
Full Text: PDF
GTID: 2568306818995369
Subject: Computer Science and Technology
Abstract/Summary:
Acoustic scene classification is a branch of applied computer science that aims to identify and analyze the environment in which a device is located. It has become increasingly important with the spread of intelligent devices and the development of the Internet. Early research on acoustic scene classification relied on traditional algorithms and machine learning methods. With advances in computer hardware and the arrival of the big data era, deep learning has replaced traditional machine learning and become the mainstream approach. However, current deep learning approaches borrow the Mel spectrogram commonly used in speech tasks for feature processing and borrow network structures from computer vision, rather than developing feature processing methods and network structures tailored to the particular characteristics of acoustic scenes. Meanwhile, the data are collected by different devices, and the amount of data differs from device to device, which results in a serious mismatch in both devices and sample counts. This thesis therefore studies acoustic scene classification from multiple perspectives, covering the following three aspects:

(1) To solve the problem of modeling the relationship between different semantic features in acoustic scene classification, an end-to-end semantic information convolutional neural network is proposed. It cuts the 2D Mel spectrogram along the time dimension and splices the pieces into a spectrogram stream, so that the semantic relationship information in the acoustic scene can be modeled while high-dimensional features are extracted. To improve generalization, this thesis also proposes a new data augmentation method that exchanges different semantic positions, which effectively reduces the influence of noise points on the data and enhances the representational capacity of the network.

(2) To solve the problem that convolutional neural networks cannot effectively use time-frequency information to model the relationship between different channels in acoustic scene classification, a multi-dimensional convolutional neural network model is proposed. After a high-dimensional feature map of the input features is obtained through 2D convolution, a stretching operation reduces the dimensionality of the feature map along the time and frequency dimensions. The relationship between different channels is then modeled by 1D convolution to obtain a channel attention matrix. Finally, the 2D convolution feature map is combined with the channel attention matrix and a residual connection is applied, so that the network makes more effective use of the time-frequency information in the Mel spectrogram.

(3) To solve the problem of device mismatch in acoustic scene classification, a domain adaptation method based on feature alignment is proposed. First, this thesis designs a two-stream network containing a source domain flow and a target domain flow. In each flow, two sub-networks process the Mel spectrogram features and the delta features of the acoustic scene, yielding the source domain and target domain feature maps used for feature alignment. Based on these feature maps, downsampling classification is carried out to test the effects of different loss functions on feature alignment. Finally, this thesis proposes an alternating training strategy to deal with the data imbalance between paired and unpaired samples, which effectively solves the problem of dataset shift during training.

In summary, this thesis studies acoustic scene classification methods from the three perspectives of features, network structure, and domain adaptation, and conducts multiple comparative experiments on different datasets to demonstrate the effectiveness and superiority of the proposed methods.
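
The channel attention idea described in aspect (2) can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function name `channel_attention`, the mean pooling used after the stretching step, the sigmoid gating, and the per-channel weight vector (rather than a full attention matrix) are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def channel_attention(feature_map, kernel):
    """Hypothetical channel attention over a (channels, freq, time) feature map.

    feature_map: output of a 2D convolution, shape (C, F, T).
    kernel: odd-length 1D weights applied across the channel axis.
    Returns the re-weighted map with a residual connection, and the weights.
    """
    c, f, t = feature_map.shape

    # "Stretching": flatten the frequency and time axes into a single axis.
    stretched = feature_map.reshape(c, f * t)

    # Assumption: reduce each stretched channel to one descriptor by averaging.
    descriptor = stretched.mean(axis=1)  # shape (C,)

    # 1D convolution across neighbouring channels, zero-padded so that the
    # output keeps one score per channel (requires an odd kernel length).
    pad = len(kernel) // 2
    padded = np.pad(descriptor, pad, mode="constant")
    scores = np.correlate(padded, kernel, mode="valid")  # shape (C,)

    # Assumption: squash the scores into (0, 1) with a sigmoid gate.
    weights = 1.0 / (1.0 + np.exp(-scores))

    # Combine the 2D feature map with the channel weights, then add the
    # original map back as a residual connection.
    attended = feature_map * weights[:, None, None]
    return attended + feature_map, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((8, 16, 32))  # 8 channels, 16 Mel bins, 32 frames
    out, w = channel_attention(fmap, kernel=np.array([0.25, 0.5, 0.25]))
    print(out.shape, w.shape)
```

In a trained network the 1D kernel would be a learned parameter; here it is fixed only to keep the sketch self-contained.
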
Keywords/Search Tags:acoustic scene classification, convolutional neural network, spectrogram stream, multi-dimensional convolution, domain adaptation