Sound Event Detection Using Attention Mechanism And Interactive Annotation

Posted on:2023-11-08

Degree:Master

Type:Thesis

Country:China

Candidate:Y Yao

GTID:2568306836463244

Subject:Information and Communication Engineering

Abstract/Summary:

Sound Event Detection aims to identify which sound events occur in a recording and their corresponding timestamps. Ambient sounds tend to overlap with one another, and complex noise environments make detection even harder. Detection is often treated as a multi-label classification problem: sounds overlap when multiple categories occur in the same frame. Timestamps are then obtained from runs of consecutive frames in which a given class of sound event is detected. Consequently, sound event detection is a continuous multi-label classification problem. As in other classification problems, neural networks can extract patterns from audio data, and they have become the mainstream method in academic research. Among neural network models, the attention mechanism has drawn particular interest and has been applied successfully in Natural Language Processing and Sound Event Detection. It supports better decisions by taking a weighted sum over audio frames. This study builds on the attention mechanism and explores to what extent it can improve Sound Event Detection. Specifically, the contributions are:

1. Because ambient sound lacks inherent grammatical and semantic structure, computing attention over frames separated by long time intervals wastes memory. Since traditional attention-based sound event detection has no memory-control mechanism, this study proposes a memory-controlled model. Evaluation on datasets from two scenarios demonstrates the effectiveness of this method.

2. The attention span for each dataset is usually selected heuristically. This study proposes an adaptive mechanism that learns its optimal memory span. Experimental results show that this mechanism performs on par with manual tuning.

3. Training uses a large amount of audio data, including synthetic strongly labeled, real-life weakly labeled, and unlabeled recordings. Using Multiple Instance Learning to leverage weakly and strongly labeled data, this study evaluates a set of pooling methods in two scenarios. According to the experiments, the advantages of attention pooling are not fully realized in DCASE Challenge 2021 Task 4, so feature-level attention pooling with a larger embedding space is proposed. The experiments indicate that even a small embedding space improves detection on all metrics.
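The two attention ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the function names, the dot-product scoring, the fixed `span` parameter, and the per-class softmax pooling are all assumptions chosen for illustration. The first function restricts each frame's attention to neighbors within `span` frames (one plausible reading of "memory-controlled" attention); the second pools frame-level class probabilities into a clip-level prediction using attention weights, as is common in Multiple Instance Learning with weak labels.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def memory_controlled_attention(frames, span):
    """Self-attention in which each frame only sees frames within `span` steps.

    frames: array of shape (T, D), one D-dimensional embedding per audio frame.
    span:   hypothetical fixed attention span, in frames (an assumption here;
            the abstract also mentions learning this value adaptively).
    Returns the attended frames (T, D) and the attention weights (T, T).
    """
    T, D = frames.shape
    # Scaled dot-product scores between every pair of frames.
    scores = frames @ frames.T / np.sqrt(D)
    # Mask out pairs of frames that are more than `span` steps apart.
    idx = np.arange(T)
    outside = np.abs(idx[:, None] - idx[None, :]) > span
    scores = np.where(outside, -np.inf, scores)
    weights = softmax(scores, axis=-1)
    return weights @ frames, weights


def attention_pool(frame_probs, attn_logits):
    """Pool frame-level probabilities (T, C) into clip-level probabilities (C,).

    attn_logits: array of shape (T, C); a softmax over time turns these into
    per-class weights, so frames with higher logits contribute more.
    """
    weights = softmax(attn_logits, axis=0)
    return np.sum(weights * frame_probs, axis=0)
```

With `span=1`, a frame attends only to itself and its immediate neighbors, so the weight matrix is banded. With all-zero `attn_logits`, attention pooling reduces to average pooling over time.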

Keywords/Search Tags:

Overlapping Sound Event Detection, Memory-controlled Attention Mechanism, Multiple Instance Learning, Attention Pooling, Neural Network

Related items

1	Research On Sound Event Detection Technology In Domestic Environment
2	Research On Sound Event Detection Based On Deep Learning
3	Research On Multi-sound Event Localization And Detection Method Based On Deep Learning
4	Research On Sound Event Detection Technology Based On Neural Network
5	Multiple Instance Learning Algorithms Based On Deep Learning
6	The Research Of Sound Event Classification And Detection On Semi-supervised Learning Method
7	Polyphonic Sound Event Detection Using Feature Space Attention And Temporal-frequency Attention
8	Research On Sound Event Detection Based On Weakly Supervised Learning
9	Research On Event Detection Based On Self-attention Neural Network
10	Research On Deep Multiple-instance Algorithms Based On Channel And Spatial Attention