Research On Sound Event Classification And Detection Method Based On Semi-supervised Learning

Posted on:2022-11-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Liang

Full Text:PDF

GTID:2518306746968679

Subject:Information and Communication Engineering

Abstract/Summary:

Sound event detection (SED) is a key technology in audio environment monitoring, smart homes and intelligent assisted driving, and in recent years it has become one of the research hotspots of intelligent sound signal processing. With the development of deep learning and the advent of the big data era, modeling sound event detection systems with deep neural networks has become the focus of many researchers, and establishing a lightweight, intelligent sound event detection system is both urgent and important.

Although great progress has been made in sound event detection in recent years, difficulties and challenges remain in the following aspects: (1) strongly labeled data with timestamps is severely lacking, so the model must learn from a small amount of weakly labeled data and a large amount of unlabeled data; (2) in most cases, audio collected from complex and changeable environments contains overlapping events and noise interference, which makes it hard to build a detection system with high accuracy and robustness; (3) as application scenarios diversify, more and more requirements concern model complexity, so building a lightweight sound event detection system is also an important problem.

This thesis focuses on these three difficulties and makes four main contributions: (1) to address the scarcity of strongly labeled data, we build a teacher-student semi-supervised learning framework based on a convolutional recurrent neural network, which makes full use of strongly labeled data, weakly labeled data and a large amount of unlabeled data to train models effectively; (2) to reduce model complexity, we build a complex teacher model to guide the training of a lightweight student model, and perform inference using only the lightweight student model; (3) to improve detection performance in complex scenes and the efficiency of model training, we propose deep feature distillation, adaptive focal loss learning, a multi-stage model training strategy and post-processing techniques; (4) to alleviate the influence of event overlap and background noise on system modeling, we propose using sound separation to assist the modeling of the sound event detection system, training the model jointly on separated data and mixed data. Finally, by proposing a multi-model score fusion strategy based on event discrimination, we exploit the complementary information between different models to further improve the overall performance of the system.

We use the DCASE 2019 Task 4 and DCASE 2021 Task 4 datasets to verify the effectiveness of the proposed techniques. The results show that deep feature distillation, adaptive focal loss learning, post-processing and the other techniques significantly improve system performance. On the DCASE 2019 Task 4 dataset, the sound event detection system based on the teacher-student model structure achieves 51.3%, 76.7% and 83.1% in Event-based F1-score, Segment-based F1-score and AT F1-score respectively. Compared with the first-place system in the DCASE 2019 Task 4 evaluation, these are improvements of 8.0%, 19.6% and 9.0% respectively. The proposed use of sound separation to assist the modeling of the sound event detection system also brings large performance benefits: compared with the baseline system, Event-based F1-score, PSDS1 and PSDS2 are improved by 4.3%, 8.6% and 20.9% respectively on the DCASE 2021 Task 4 dataset. In addition, the score fusion method based on class discrimination significantly improves the performance of the sound event detection system.

In conclusion, this thesis first proposes a series of techniques to improve the sound event detection system based on the teacher-student model structure, and then proposes a sound event detection system assisted by sound separation to further improve SED performance. Extensive experiments are performed to analyze the effectiveness of all the proposed methods, and the proposed techniques are compared with related state-of-the-art techniques in the literature. The thesis closes with a summary of the work and directions for future research.
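The abstract does not give the training objective, so the following is only a minimal sketch of how a teacher-student semi-supervised loss is commonly assembled: a supervised term on labeled clips plus a consistency term that pulls the student's predictions toward the teacher's on every clip, labeled or not. The function names, the mean-squared-error consistency term, the `consistency_weight` parameter and the fixed `gamma` are all assumptions for illustration; in particular, the focal loss here is the standard fixed-gamma form, not the adaptive variant proposed in the thesis.

```python
import math


def focal_bce(prob, target, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for one class probability.

    Assumption: fixed gamma. The thesis's adaptive focal loss is not
    specified in the abstract and is not reproduced here.
    """
    prob = min(max(prob, eps), 1.0 - eps)
    # Probability assigned to the correct outcome (event present or absent).
    p_t = prob if target == 1 else 1.0 - prob
    # Well-classified examples (p_t near 1) are down-weighted by (1 - p_t)^gamma.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def semi_supervised_loss(student, teacher, labels, consistency_weight=1.0):
    """Supervised focal loss on labeled clips plus MSE consistency on all clips.

    student, teacher: per-clip lists of class probabilities.
    labels: per-clip lists of 0/1 targets, or None for an unlabeled clip.
    """
    supervised_terms = []
    consistency_terms = []
    for s_clip, t_clip, y_clip in zip(student, teacher, labels):
        # Consistency uses every clip, so unlabeled data still contributes.
        consistency_terms.extend((s - t) ** 2 for s, t in zip(s_clip, t_clip))
        if y_clip is not None:
            supervised_terms.extend(focal_bce(s, y) for s, y in zip(s_clip, y_clip))
    supervised = sum(supervised_terms) / len(supervised_terms) if supervised_terms else 0.0
    consistency = sum(consistency_terms) / len(consistency_terms) if consistency_terms else 0.0
    return supervised + consistency_weight * consistency


# Hypothetical batch: one weakly labeled clip and one unlabeled clip, two classes.
student_probs = [[0.9, 0.1], [0.6, 0.4]]
teacher_probs = [[0.8, 0.2], [0.5, 0.5]]
clip_labels = [[1, 0], None]
loss = semi_supervised_loss(student_probs, teacher_probs, clip_labels)
```

In this sketch only the student would be updated by the loss, and only the student would be used at inference, which matches the abstract's description of a complex teacher guiding a lightweight student.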

Keywords/Search Tags:

Sound event classification and detection, Semi-supervised learning, Feature distillation, Speech separation, Score fusion

Related items

1	The Research Of Sound Event Classification And Detection On Semi-supervised Learning Method
2	Research On Sound Event Detection Technology In Domestic Environment
3	Research On Polyphonic Sound Event Detection With Deep Neural Network
4	Research On Semi-supervised Text Classification Method Based On Deep Learning
5	Semi-Supervised PolSAR Terrain Classification Based On Mixup
6	Research On Sound Event Detection Based On Deep Learning
7	Research On Image Classification Algorithm Based On Semi-supervised Learning
8	Research Of Reliable Semi-supervised Classification
9	Research On The Application Of Semi-supervised Learning In Natural Language Processing
10	Research On Semi-supervised Clustering And Classification Algorithm