Acoustic Scene Classification With Sound Field Decomposition

Posted on:2022-10-29

Degree:Master

Type:Thesis

Country:China

Candidate:H C Yang

Full Text:PDF

GTID:2518306524985309

Subject:Master of Engineering

Abstract/Summary:

Acoustic scene classification (ASC) is the task of having a machine identify the environment it is in from the sounds it receives. In recent years, thanks to breakthroughs in computing power and algorithms, ASC research has entered a period of rapid iteration. At the same time, this progress has exposed new problems, one of which is how to process stereo data. The largest dataset in the acoustic scene field is now in stereo format, yet current mainstream ASC methods still follow single-channel thinking even when they use stereo sources, and they pay no special attention to the additional spatial information that stereo provides. The first contribution of this thesis is to extract that additional spatial information from stereo data. Using the phase information contained in the stereo signal, a sound field decomposition algorithm called primary ambient extraction (PAE) further decomposes the stereo audio into four channels, yielding more spatial information. The system still uses log-mel energy as its acoustic feature, which discarded phase information in previous implementations; in the proposed PAE-based implementation, the phase information is instead encoded into the amplitude of the additional channels. On this basis, a variety of state-of-the-art ASC methods are used to build several VGGNet models, and ensemble learning is then applied to obtain better performance. In a series of experiments on data with different distributions, the system using sound field decomposition shows a stable performance improvement over the system without it. The best configuration improves recognition accuracy by 17.3% over the baseline system and ranked fourth in the ASC subtask of the international DCASE 2019 challenge, two places higher than the same system without sound field decomposition.

Building on the first contribution, the thesis also explores an application of ASC. After the functionally redundant categories of the original data are merged into three classes, the system is migrated to a low-complexity setting. First, the sound field decomposition algorithm is optimized, making it nearly 10 times faster than the original implementation. At the same time, the role of sound field decomposition is changed from feature generation to data augmentation in order to reduce feature complexity. On this basis, the model parameters are compressed to less than 1% of the original setting. A low-complexity stereo ASC system is built that uses the fine receptive field resolution provided by a stack of 1×1 residual blocks; it is similar in size to the baseline system and has less than one third of its error rate. Finally, a variety of model compression methods are used to verify that data augmentation with sound field decomposition remains effective when the model is compressed further.
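The abstract does not say which PAE variant the thesis uses, so the following is only a minimal sketch of one common formulation, PCA-based primary ambient extraction, applied in the time domain to a whole signal. The function name pae_pca and the output channel ordering are assumptions made for illustration; a practical system would more likely run the decomposition frame by frame or per frequency band.

```python
import numpy as np


def pae_pca(stereo):
    """Split a stereo signal into primary and ambient parts with PCA.

    Hypothetical helper, not taken from the thesis.
    stereo: array of shape (2, n_samples), rows are left and right.
    Returns an array of shape (4, n_samples) ordered as
    [primary_left, primary_right, ambient_left, ambient_right].
    """
    x = np.asarray(stereo, dtype=float)
    # 2x2 inter-channel correlation matrix (signals assumed zero-mean).
    cov = x @ x.T / x.shape[1]
    # eigh returns eigenvalues in ascending order, so the last column
    # is the principal direction shared by both channels.
    _, eigvecs = np.linalg.eigh(cov)
    u = eigvecs[:, -1:]
    # Primary part: projection onto the principal direction.
    primary = u @ (u.T @ x)
    # Ambient part: whatever the projection does not explain.
    ambient = x - primary
    return np.vstack([primary, ambient])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.standard_normal(16000)
    # A panned source plus uncorrelated noise in each channel.
    left = source + 0.1 * rng.standard_normal(16000)
    right = 0.5 * source + 0.1 * rng.standard_normal(16000)
    channels = pae_pca(np.vstack([left, right]))
    print(channels.shape)
```

Under this sketch, each of the four output channels would then go through the same log-mel energy extraction as the original stereo channels, so the inter-channel relationship ends up reflected in the magnitudes of the extra channels.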

Keywords/Search Tags:

Acoustic scene classification, Stereo, Sound field decomposition, Low complexity, ResNet

Related items

1	Acoustic Scene Classification With Mismatched Recording Devices
2	Sound Wave Separation Method In Three Dimensional Acoustic Field With Single Layer Microphone Array
3	Feature Augmentation And Model Build For Acoustic Scene Classification With Multiple Devices
4	Acoustic Scene Classification Via Classifiers Voting
5	Research On Acoustic Scene Classification Based On Convolutional Neural Network
6	Acoustic Scene Classification And Sound Event Detection Based On Deep Learning
7	Acoustic Scene Classification Based On Adversarial Domain Adaptation
8	Study On Reconstructing Three Dimensional Interior Sound Field Based On Spherical Acoustic Holography
9	Research On Sound Classification Model In Few-shot Scene
10	Research On Sound Scene Classification Based On Deep Learning