Rare Sound Detection Based On Multi-scale Neural Network

Posted on: 2022-05-07, Degree: Master, Type: Thesis
Country: China, Candidate: Y T Zhou, Full Text: PDF
GTID: 2568307049960139, Subject: Computer technology
Abstract/Summary:
Rare sound event detection is important in applications such as smart homes, health monitoring, acoustic monitoring, and audio monitoring. Some recent rare sound event detection methods integrate multiple models, which leads to high training cost because of the complex model structure and the large number of parameters. To address these problems, this paper proposes a rare sound event detection model based on multi-scale depthwise separable convolution and a recurrent neural network, combining multi-scale time-frequency convolution with multi-scale feature fusion. The main contributions of this paper are as follows:

(1) A feature extraction method based on multi-scale time-frequency convolution. To reduce the number of network parameters and extract effective multi-scale time-frequency feature maps, the input spectrogram features are convolved with a one-dimensional frequency-domain convolution followed by a one-dimensional time-domain convolution, instead of an ordinary two-dimensional convolution, in view of the one-dimensional temporal nature of audio signals.

(2) A rare sound event detection model built from multi-scale depthwise separable convolution and a recurrent neural network. First, the depthwise separable convolution module is improved with the multi-scale time-frequency convolution method. Second, the original two-dimensional convolution module in the CNN is replaced by the improved multi-scale depthwise separable convolution module to extract high-dimensional abstract features from the input Mel energy spectrum. Third, a Gated Recurrent Unit (GRU), a variant of the RNN, models the temporal context. Finally, the classification result is obtained as the prediction matrix output by the fully connected layer.

(3) Multi-scale feature fusion. To capture more informative sound features and improve model performance, depthwise separable convolution modules of different scales extract sound feature maps of different scales, and these feature maps are concatenated to form multi-scale fused features.

The proposed method was tested on the DCASE 2017 rare sound detection dataset. The results show that the proposed method improves the error rate (ER) and F-score by 0.43 and 24.3%, respectively, over the official baseline method.
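The parameter saving claimed for the time-frequency factorization can be illustrated with a small weight count. The sketch below is not the thesis's implementation: the channel widths (64), the kernel sizes (3, 5, 7), the one-branch-per-scale layout, and all function names are hypothetical choices made for illustration, since the abstract does not state them. It compares an ordinary k x k two-dimensional convolution with a depthwise separable block whose depthwise stage is split into a k x 1 frequency convolution followed by a 1 x k time convolution.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in an ordinary k x k 2D convolution (bias ignored)."""
    return c_in * c_out * k * k


def tf_separable_params(c_in, c_out, k):
    """Weights in a depthwise separable block with a factorized depthwise stage.

    Assumed layout (hypothetical, not taken from the thesis):
    k x 1 depthwise frequency conv -> 1 x k depthwise time conv -> 1 x 1 pointwise conv.
    Bias terms are ignored.
    """
    freq = c_in * k            # one k x 1 filter per input channel
    time = c_in * k            # one 1 x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return freq + time + pointwise


def multi_scale_params(c_in, c_out_per_branch, kernel_sizes, block):
    """Total weights of parallel branches, one per kernel size.

    The branch outputs would be concatenated along the channel axis;
    concatenation itself adds no weights.
    """
    return sum(block(c_in, c_out_per_branch, k) for k in kernel_sizes)


# Hypothetical sizes chosen only to make the comparison concrete.
C_IN, C_OUT, SCALES = 64, 64, (3, 5, 7)

standard_total = multi_scale_params(C_IN, C_OUT, SCALES, standard_conv_params)
separable_total = multi_scale_params(C_IN, C_OUT, SCALES, tf_separable_params)

print("standard 2D conv weights:      ", standard_total)
print("time-frequency separable weights:", separable_total)
print("reduction factor: %.1f" % (standard_total / separable_total))
```

Under these assumed sizes the three standard branches need 339,968 weights while the factorized separable branches need 14,208, roughly a 24-fold reduction. The actual saving in the thesis depends on its real layer widths and kernel sizes, which the abstract does not give.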
Keywords/Search Tags:Rare sound detection, Multi-scale time-frequency convolution, Depthwise separable convolution, Recurrent neural network, Multi-scale feature fusion