Rare Sound Detection Based On Multi-scale Neural Network

Posted on:2022-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:Y T Zhou

GTID:2568307049960139

Subject:Computer technology

Abstract/Summary:

Rare sound event detection is important in applications such as smart homes, health monitoring, acoustic monitoring, and audio monitoring. Some recent rare sound event detection methods integrate multiple models, which leads to high training cost because of the complex model structure and the large number of parameters. To address these problems, this thesis proposes a rare sound event detection model based on multi-scale depthwise separable convolution and a recurrent neural network, combining multi-scale time-frequency convolution with multi-scale feature fusion. The main contributions are as follows:

(1) A feature extraction method based on multi-scale time-frequency convolution. To reduce the number of network parameters and extract effective multi-scale time-frequency feature maps, the input spectrogram features are convolved with a one-dimensional frequency-domain convolution followed by a one-dimensional time-domain convolution, instead of an ordinary two-dimensional convolution, in view of the one-dimensional temporal nature of audio signals.

(2) A rare sound event detection model built from multi-scale depthwise separable convolution and a recurrent neural network. First, the depthwise separable convolution module is improved with the multi-scale time-frequency convolution method. Second, the original two-dimensional convolution module in the CNN is replaced by the improved multi-scale depthwise separable convolution module to extract high-dimensional abstract features from the input Mel energy spectrum. Third, a Gated Recurrent Unit (GRU), a variant of the RNN, models the temporal context. Finally, the classification result is obtained as the prediction matrix output by the fully connected layer.

(3) Multi-scale feature fusion. To capture more informative sound features and improve model performance, depthwise separable convolution modules of different scales extract sound feature maps at different scales, and these feature maps are concatenated to form multi-scale fusion features.

The proposed method was evaluated on the DCASE 2017 rare sound detection dataset. The results show that the ER and F values of the proposed method improve by 0.43 and 24.3%, respectively, over the official baseline method.
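The parameter savings claimed for the time-frequency factorization and for depthwise separable convolution can be checked with simple arithmetic. The sketch below counts convolution weights (biases ignored) for four layer variants. The channel counts, kernel sizes, function names, and the exact way the factorization is combined with the depthwise step are illustrative assumptions, not values or designs taken from the thesis.

```python
# Weight counts for one convolution layer, biases ignored.
# All sizes below are hypothetical; the abstract does not give the
# thesis's actual channel counts or kernel sizes.

def conv2d_params(c_in, c_out, k_f, k_t):
    """Ordinary 2-D convolution with a k_f x k_t kernel."""
    return c_in * c_out * k_f * k_t


def factorized_params(c_in, c_out, k_f, k_t):
    """A k_f x 1 frequency convolution followed by a 1 x k_t time convolution."""
    return c_in * c_out * k_f + c_out * c_out * k_t


def depthwise_separable_params(c_in, c_out, k_f, k_t):
    """Depthwise k_f x k_t convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution."""
    return c_in * k_f * k_t + c_in * c_out


def factorized_depthwise_separable_params(c_in, c_out, k_f, k_t):
    """Assumed combination: depthwise k_f x 1, then depthwise 1 x k_t,
    then a 1 x 1 pointwise convolution."""
    return c_in * k_f + c_in * k_t + c_in * c_out


if __name__ == "__main__":
    c_in, c_out = 64, 64          # hypothetical channel counts
    for k in (3, 5, 7):           # hypothetical "scales" (kernel sizes)
        print(
            f"k={k}:",
            "conv2d", conv2d_params(c_in, c_out, k, k),
            "| factorized", factorized_params(c_in, c_out, k, k),
            "| dw-separable", depthwise_separable_params(c_in, c_out, k, k),
            "| factorized dw-separable",
            factorized_depthwise_separable_params(c_in, c_out, k, k),
        )
```

Under these assumed sizes, the gap between the ordinary and the factorized depthwise separable layer widens as the kernel grows, which is consistent with the motivation for using several kernel scales without a large parameter cost.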

Keywords/Search Tags:

Rare sound detection, Multi-scale time-frequency convolution, Depthwise separable convolution, Recurrent neural network, Multi-scale feature fusion

Related items

1	Research On Real-time Face Detection Model Based On Multi-layer Feature Fusion
2	Research On Pedestrian Anomaly Behavior Recognition Algorithm Based On Multi-fiber Feature Fusion
3	Research On Recognition Of Sound Events Based On Multi-scale And Multi-level Feature Analysis
4	Efficient And Lightweight Feature Pyramid Network For Object Detection
5	The Research Of Object Detection Model Based On Convolution Neural Network
6	Research On Real-Time Semantic Segmentation Method Based On Feature Fusion
7	Research On Multi-scale Face Detection Based On Convolution Neural Networks
8	Research On Multi-sound Event Localization And Detection Method Based On Deep Learning
9	Real-Time Multi-Persons Pose Estimation In Complex Scenes
10	Research On Super-Resolution Image Reconstruction Based On Deep Convolution Neural Network