
Research On Audio Event Diversity Based On Deep Learning

Posted on: 2022-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: X F Hong
GTID: 2558306914459844
Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of deep learning, research on audio events has achieved many breakthroughs and has become an active research topic. Building an efficient audio event detection system requires a close look at the core difficulty of the task, namely the diversity of audio events. This thesis starts from the theory of audio event diversity and lets theory guide practice: it tracks progress in cutting-edge technology and international events, applies deep learning to design a research framework that matches the diversity attributes of audio events, and discusses the composition and application of the attribute space of audio event diversity. Addressing the definition of diversity and the division of the attribute space, and drawing on the diversity of audio production and the perception principle of convolutional neural networks (CNNs), the thesis studies deep-learning-based methods for representing, extracting, and understanding diversity. The main work is summarized as follows.

1. The diversity of audio events is studied.

(1) CNN-based extraction of diversity information from audio events. Addressing the definition of audio event diversity, a layer-by-layer supervised training method for diversity representation and diversity information extraction is studied, based on the diversity of audio generation and the perception principle of CNNs. The theoretical part explores what diversity is; the practical part develops a many-to-many solution to diversity, that is, multiple features and multiple model structures. In theory, the definition of audio event diversity is summarized, and the diversity of the excitation sources and resonant cavity characteristics of audio events is analyzed. In practice, for diversity representation, multi-feature fusion is discussed as a way to handle diversity. For diversity information extraction, the thesis compares how multi-convolution model structures and CNN structures based on category dependence analysis perform on different audio events, and optimizes the layer-by-layer supervised training method.

(2) Understanding audio event diversity through the attribute space. Addressing the problem of dividing the attribute space of audio events, a multi-task learning method based on that attribute space is studied, motivated by the structural differences and diversity of the excitation sources and resonant cavities of audio events. In theory, starting from an analysis of excitation source and resonant cavity characteristics, the generation and structural diversity of various kinds of audio events are analyzed, and the basic attribute space of audio events is divided. In practice, the audio events in each dataset are assigned to different attribute spaces, and multi-task learning is applied to optimize the original layer-by-layer supervised training framework. On the ESC-50 and UrbanSound8K datasets, classification accuracies of 89.00% and 84.21% are obtained with a single model and no data augmentation.

2. The classification of multi-label and pseudo-label audio events is studied.

Addressing the problem of multi-label and pseudo-label (machine-generated, unreliable label) audio events, a multi-label audio event classification method that needs only a small amount of supervision is studied, based on semi-supervised learning, multi-task learning, and the principle of the Laplacian matrix. The diversity of audio events can also include the diversity of datasets. Based on DCASE 2019 and DCASE 2020, the thesis explores how to use datasets with unreliable labels within a multi-task, semi-supervised learning structure, and analyzes the difficulties of classifying multi-label datasets. Starting from the loss function, the co-occurrence relationship of audio events is quantified, which improves the model's ability to capture that relationship. On the DCASE 2019 Task 2 test set, a classification accuracy of 73.17% is achieved with a single model.

Following the approach of letting theory guide practice and combining practice with theory, this thesis summarizes the attribute space of audio event diversity, improves the audio event detection and classification system, and makes a preliminary exploration of multi-label audio event classification.
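
The abstract says the co-occurrence relationship of audio events is quantified from the loss function using the principle of the Laplacian matrix, but it gives no formula. The sketch below shows one common way such a term can be built, and it is an assumption rather than the thesis's actual method: count label co-occurrences in a multi-label training set, form the unnormalized graph Laplacian L = D - W, and penalize predictions whose scores differ across classes that often occur together. All function names and the toy label matrix are hypothetical.

```python
def cooccurrence_matrix(labels):
    """Count how often each pair of classes is active in the same clip.

    labels: list of rows, one per clip, each a list of 0/1 class flags.
    Returns a symmetric matrix with a zero diagonal.
    """
    n_classes = len(labels[0])
    counts = [[0] * n_classes for _ in range(n_classes)]
    for row in labels:
        active = [k for k, flag in enumerate(row) if flag]
        for i in active:
            for j in active:
                if i != j:
                    counts[i][j] += 1
    return counts


def graph_laplacian(weights):
    """Unnormalized Laplacian L = D - W, where D holds the row sums of W."""
    n = len(weights)
    lap = [[-weights[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] = sum(weights[i])  # diagonal of W is zero by construction
    return lap


def laplacian_penalty(scores, lap):
    """Quadratic form s^T L s.

    For a symmetric W this equals the sum over unordered class pairs of
    W[i][j] * (s[i] - s[j])**2, so it grows when classes that often
    co-occur receive very different scores.
    """
    n = len(scores)
    return sum(scores[i] * lap[i][j] * scores[j]
               for i in range(n) for j in range(n))


# Toy multi-label set (hypothetical): 4 clips, 3 classes.
toy_labels = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
W = cooccurrence_matrix(toy_labels)
L = graph_laplacian(W)

# Classes 0 and 1 co-occur most often, so similar scores for them
# give a smaller penalty than dissimilar ones.
consistent = laplacian_penalty([0.9, 0.8, 0.1], L)
inconsistent = laplacian_penalty([0.9, 0.1, 0.8], L)
print(W, L, consistent, inconsistent)
```

In a training setup, a term like this would typically be added to the usual multi-label classification loss with a small weight; that weighting, and whether raw counts or normalized co-occurrence frequencies are used, are design choices the abstract does not specify.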
Keywords/Search Tags: audio event diversity, deep learning, multi-task learning, semi-supervised learning, multi-label