
Audio Event Detection Based On Diversity Information Understanding

Posted on: 2023-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Fang
Full Text: PDF
GTID: 2568306914482034
Subject: Information and Communication Engineering
Abstract/Summary:
Audio event detection is an active research topic, and deep learning has brought major breakthroughs to the field. One of the remaining difficulties is adapting deep learning methods to the characteristics of audio events. To improve deep network architectures for audio event detection, this thesis starts from two key problems, diversity and multi-label classification, and proposes methods and an architecture that strengthen the network's understanding of audio event diversity. The main contributions are as follows:

(1) A frequency-domain information extraction method based on implicit attribute tokens is proposed. The frequency-domain token weights the individual frequency bands and thereby changes how much each band contributes to the classification result. This approach improved micro-AUPRC and F1 by 1% and 0.6%, respectively, on Task 5 of DCASE 2020.

(2) An improved audio event pooling method based on FcaNet is presented. The spectrum of an audio event shows different kinds of diversity along the frequency dimension and the time dimension, and FcaNet's pooling method can effectively address this structural diversity of audio events in the spectrum. A series of experiments was used to obtain FcaNet parameters suited to audio event detection, and the improved FcaNet is applied in the network to improve the final classification results.

(3) A multi-label relationship modeling method based on pre-classification is proposed. The initial output labels produced by the trained network are processed and fed back into the network, which introduces the relationships between labels into the network when a weakly labeled dataset is used. On Task 5 of DCASE 2020, micro-AUPRC and F1 improved by 1.3% and 0.7%, respectively.
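The band-weighting idea in contribution (1) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify how the implicit attribute token is built or trained, so the softmax normalization, the rescaling by the number of bands, and the names `softmax`, `weight_frequency_bands` and `band_logits` are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weight_frequency_bands(spectrogram, band_logits):
    """Scale each frequency band of a spectrogram by a per-band weight.

    spectrogram: list of rows, one row per frequency band, each row a list
                 of values over time frames.
    band_logits: one score per band (hypothetical stand-in for the learned
                 frequency-domain token).
    """
    if len(spectrogram) != len(band_logits):
        raise ValueError("need exactly one logit per frequency band")
    weights = softmax(band_logits)
    n_bands = len(weights)
    # Multiply by n_bands so that equal logits leave the input unchanged.
    return [
        [w * n_bands * value for value in row]
        for w, row in zip(weights, spectrogram)
    ]


# Toy input: 3 frequency bands, 2 time frames.
spec = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
uniform = weight_frequency_bands(spec, [0.0, 0.0, 0.0])
emphasised = weight_frequency_bands(spec, [0.0, 0.0, 2.0])
```

With equal logits the spectrogram passes through unchanged. Raising the logit of the third band amplifies that band and attenuates the other two, which is the sense in which band weights shift each band's influence on the downstream classifier.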
Keywords/Search Tags: audio event detection, attention mechanism, structured, diversity problem, multi-label problem