
Research On Deep Network Model Based On Sound Event Location And Detection

Posted on: 2022-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Han
Full Text: PDF
GTID: 2518306536496334
Subject: Master of Engineering
Abstract/Summary:
In recent years, sound event localization and detection (SELD) has been widely used in various fields. For example, in areas with flammable materials, localizing and detecting the sound of burning flames allows a timely alarm, so that a fire can be contained at an early stage; in earthquake zones, trapped people can be rescued by detecting and localizing their calls for help. It is therefore reasonable to combine sound event detection with localization, so that a system can identify not only the type and temporal position of a sound but also its spatial position. This thesis uses deep neural networks to explore the effectiveness of sound event localization and detection models.

First, this thesis builds a gated convolutional recurrent neural network model that addresses both sound event detection (SED) and direction of arrival (DOA) estimation. Compared with the DCASE 2019 baseline system, the gated model can localize and detect sound events in different environments while also improving the accuracy of both SED and DOA estimation, and it performs significantly better than the baseline method.

Second, this thesis builds a temporal convolutional network model based on the residual 2 (Res2) module. The Res2 module explores multi-scale representation at a more fine-grained level by progressively enlarging the receptive field of the network. Squeeze, excitation, and re-weighting operations are added to this module to fuse spatial information across feature channels. The temporal convolution module simplifies the network structure and speeds up training. Tested on the DCASE 2020 Challenge Task 3 dataset, the network shows better sound event localization and detection performance.

Finally, this thesis builds a Ghost convolutional time-frequency segmentation attention network model. Compared with ordinary convolution, Ghost convolution uses fewer parameters and saves computing resources. The time-frequency segmentation network is trained on audio clips to find sound events in the time-frequency domain, so that subsequent models can better identify, enhance, and segment them. A self-attention mechanism is added to strengthen the focus on the acoustic features relevant to SELD. On the DCASE 2020 Task 3 SELD task, comparisons with the baseline system and with other teams' challenge entries indicate that the model is of practical value.
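The claim that Ghost convolution uses fewer parameters than ordinary convolution can be illustrated with a rough weight count. The sketch below is not taken from the thesis. It assumes the Ghost module formulation from the GhostNet paper, in which a primary convolution produces a fraction of the output channels and cheap depthwise operations generate the rest. The function names, the layer sizes, and the `ratio` and `cheap_k` defaults are hypothetical choices for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weight count of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weight count of a Ghost module (bias ignored).

    Assumed structure: a primary k x k convolution produces
    c_out / ratio "intrinsic" channels, then a depthwise cheap_k x cheap_k
    operation creates (ratio - 1) extra "ghost" maps per intrinsic channel.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    ordinary = conv_params(64, 128, 3)
    ghost = ghost_conv_params(64, 128, 3)
    print("ordinary:", ordinary)
    print("ghost:", ghost)
    print("reduction factor:", round(ordinary / ghost, 2))
```

Under these assumptions the Ghost layer needs roughly half the weights of the ordinary layer when `ratio` is 2, because the depthwise cheap operations add very few weights compared with the primary convolution.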
Keywords/Search Tags:Sound event localization and detection, Gated convolutional recurrent neural network, Time convolutional network, Ghost convolution, Time-frequency segmentation network