Neighbourhood Similarity Augmentation On Multi-source Sound Event Detection And Localization

Posted on: 2021-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: Q Hu
Full Text: PDF
GTID: 2518306104988109
Subject: Computer system architecture
Abstract/Summary:
Sound event detection and localization (SELD) is a research focus in acoustics and plays a key role in future smart devices and virtual interactive systems. For a single sound source without interference, existing methods already achieve near-ideal SELD precision. In real scenes, however, the acoustic signal is degraded by reverberation, noise, and overlapping sound sources, so precision still needs improvement. This thesis studies techniques for enhancing SELD on small-scale data sets under complex conditions with multiple sound sources.

A decomposition method based on neighborhood similarity is proposed, which splits the SELD task into a neighborhood similarity task and a SELD prediction enhancement task. The neighborhood similarity task exploits the continuity of sound events over their duration and predicts four types of neighborhood similarity with a convolutional recurrent neural network (CRNN) model. The thesis explores the impact of different neighborhood similarity feature generation strategies and demonstrates the applicability of neighborhood similarity methods to small-scale data sets. The SELD prediction enhancement task proposes two strategies, multi-source result enhancement and single-source model enhancement. The thesis measures the actual enhancement effect of each strategy and analyzes the reasons for the differences between them.

On small-scale data sets, the neighborhood similarity task shows that Short-time Fourier Transform (STFT) difference features give the highest classification accuracy: 93% for correlation classification and 96%, 59%, 82%, and 61% for the four types of similarity, higher than the subtasks of the sound source number decomposition method under the same conditions. Compared with the benchmark model, the sound event detection (SED) error rate decreases by 4.8% with the multi-source result enhancement method and by 3.3% with the single-source model enhancement method, the F-score increases by 1.9% and 1.5% respectively, and the SELD score increases by 1.1% for both methods. These results indicate that enhancement based on neighborhood similarity can improve the precision of multi-source SELD without changing the model, and they offer another approach to the detection and localization of sound events.
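The abstract reports SED results as an error rate and an F-score without defining them. As a rough illustration only, the sketch below computes the segment-based error rate and F-score in the form commonly used for SED evaluation, where substitutions, deletions, and insertions are counted per segment. It assumes binary activity matrices of shape (segments, classes); the function name `sed_error_rate_and_f_score` and the toy data are hypothetical, and this is not taken from the thesis's own evaluation code.

```python
import numpy as np


def sed_error_rate_and_f_score(reference, prediction):
    """Segment-based SED error rate and F-score (illustrative sketch).

    Both inputs are binary arrays of shape (segments, classes), where a 1
    means the class is active in that segment. The layout is an assumption.
    """
    ref = np.asarray(reference, dtype=bool)
    pred = np.asarray(prediction, dtype=bool)
    if ref.shape != pred.shape:
        raise ValueError("reference and prediction must have the same shape")

    # Per-segment counts of true positives, false positives, false negatives.
    tp = np.logical_and(ref, pred).sum(axis=1)
    fp = np.logical_and(~ref, pred).sum(axis=1)
    fn = np.logical_and(ref, ~pred).sum(axis=1)

    # A false positive paired with a false negative in the same segment
    # counts as one substitution; the remainder are deletions or insertions.
    substitutions = np.minimum(fn, fp).sum()
    deletions = np.maximum(0, fn - fp).sum()
    insertions = np.maximum(0, fp - fn).sum()
    n_reference = ref.sum()

    # Guard against empty references or empty results (returns 0.0 by choice).
    error_rate = (
        (substitutions + deletions + insertions) / n_reference
        if n_reference > 0 else 0.0
    )
    denominator = 2 * tp.sum() + fp.sum() + fn.sum()
    f_score = 2 * tp.sum() / denominator if denominator > 0 else 0.0
    return float(error_rate), float(f_score)


# Toy example: 4 segments, 3 event classes (made-up data).
reference = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]]
prediction = [[1, 0, 0], [1, 0, 1], [0, 0, 0], [0, 1, 0]]
er, f1 = sed_error_rate_and_f_score(reference, prediction)
print(f"error rate = {er:.2f}, F-score = {f1:.2f}")
```

With these definitions a lower error rate and a higher F-score both indicate better detection, which matches the direction of the improvements reported in the abstract.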
Keywords/Search Tags:Sound event detection, Sound event localization, Convolutional recurrent neural network, Neighborhood similarity