With the continuous evolution of machine learning, audio classification has become a research hotspot with broad applications across many fields. Zero-shot learning offers a way around the scarcity of target audio data, which currently hinders research in audio processing. Its objective is to recognize audio classes that were never encountered during training, thereby reducing the dependence of audio classification methods on labeled datasets and allowing them to adapt to a wider range of application scenarios. In light of this, this paper presents a comprehensive study of zero-shot audio signal classification; the corresponding model comprises audio feature extraction, auditory descriptor generation, and zero-shot learning, and aims to predict the categories of unseen test samples by learning from samples of seen audio categories. The primary contributions of this paper are as follows:

(1) This paper presents a novel approach based on spectrograms and synthesized classifiers, addressing the limited representational capability of audio features and the insufficient discriminative information learned in zero-shot audio classification. The method converts audio signals into spectrograms, feeds them into a pre-trained model to obtain corresponding feature representations, and applies a synthesized-classifier method to perform zero-shot audio classification.

(2) This paper proposes a zero-shot audio classification model based on artificial auditory descriptors, addressing the high redundancy and modal mismatch of semantic auditory descriptors. These descriptors, generated through manual auditory annotation, form an artificial auditory confusion matrix that represents inter-category differences. This approach avoids the drawback of semantic auditory descriptors carrying a large amount of non-audio-related information, thereby enhancing model performance.

(3) This paper introduces a zero-shot audio signal classification model based on generative learning, targeting the diversity of unseen audio class samples. In the zero-shot learning task, a generator network produces samples of unseen categories, so that the classifier can be trained on the generated data and acquire the ability to classify the new categories, improving audio classification performance.

Finally, we report experimental results for these models on the ESC-50 dataset, and for some of the models on a pre-screened subset of Audio Set, together with an analysis of the results. The results demonstrate that the proposed methods improve the zero-shot classification performance of audio signals to varying degrees.
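The synthesized-classifier idea behind contribution (1) can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the dimensions, the softmax-similarity weighting, and all variable names are assumptions. The core step is building classifier weights for unseen classes as descriptor-similarity-weighted blends of the classifiers learned on seen classes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 5 seen classes, 2 unseen classes,
# 8-dim auditory descriptors, 16-dim audio (spectrogram) features.
n_seen, n_unseen, d_sem, d_feat = 5, 2, 8, 16

seen_descr = rng.normal(size=(n_seen, d_sem))      # descriptors of seen classes
unseen_descr = rng.normal(size=(n_unseen, d_sem))  # descriptors of unseen classes
seen_weights = rng.normal(size=(n_seen, d_feat))   # classifiers trained on seen classes

def synthesize_classifiers(unseen_descr, seen_descr, seen_weights, tau=1.0):
    """Blend seen-class classifiers into unseen-class ones, weighted by
    softmax similarity between class descriptors."""
    sim = unseen_descr @ seen_descr.T / tau               # (n_unseen, n_seen)
    alpha = np.exp(sim - sim.max(axis=1, keepdims=True))  # stable softmax
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha @ seen_weights                           # (n_unseen, d_feat)

unseen_weights = synthesize_classifiers(unseen_descr, seen_descr, seen_weights)

# Classify one test feature vector (e.g. a pooled spectrogram embedding)
# among the unseen classes only, via dot-product scores.
x = rng.normal(size=d_feat)
pred = int(np.argmax(unseen_weights @ x))
print(unseen_weights.shape, pred)
```

The sketch uses random vectors in place of real pre-trained-model features and real auditory descriptors; in the actual pipeline the feature extractor and the descriptor source would supply these inputs.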