
Research On Audio Classification Based On Knowledge Distillation

Posted on: 2020-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: L Gao
Full Text: PDF
GTID: 2518306548994249
Subject: Computer Science and Technology
Abstract/Summary:
Audio is an important information carrier, and research on audio signal processing has great value. Deep learning-based methods can automatically extract task-related features from massive data, and previous work has demonstrated their superior performance on audio classification tasks, making them a hot research direction in audio classification. However, deep learning techniques are often accompanied by huge computational complexity and storage overhead, while acoustic models are usually deployed on embedded devices with limited resources; this resource shortage limits the industrial application of large audio classification models. In addition, audio data are highly sequential, mixed with noise, and lack prominent features, which makes the accuracy of audio classification models difficult to improve. The study of accuracy improvement and model compression therefore has important industrial and academic value. Knowledge distillation is a knowledge transfer method used both for model compression and for improving model performance. This thesis studies knowledge distillation-based deep learning methods for audio classification tasks.

To address the problem that deep networks occupy too many resources while smaller models cannot meet performance demands, this thesis proposes an adversarial feature distillation method that trains high-performance small networks to compress models. Unlike previous approaches such as model ensembles or deeper networks, which improve performance at the cost of increased resource occupancy, knowledge distillation transfers the feature-map knowledge of a complex model to a simple network, enhancing the simple network's performance. In addition, an adversarial learning strategy is used to strengthen the effect of knowledge distillation and the supervision of feature-map learning, which reduces the loss of fine-grained information. On the audio classification task, the adversarial feature distillation method effectively compresses the model, achieving a computational compression ratio of 76:1 and a parameter compression ratio of 3:1, while the performance of the small model approaches or even exceeds that of the large model.

The second problem is that a single feature representation of the audio cannot provide enough information for the model, resulting in low accuracy. This thesis proposes a multi-representation knowledge distillation training framework that takes multiple audio representations as input and uses their complementary information to enhance model performance. In general, the first step in using neural networks for audio classification is to transform the original audio signal into a higher-level feature representation, but operations commonly used in this process, such as the Fourier transform and the discrete cosine transform, cause information loss, so a network fed a single representation can learn only part of the original signal. Previous research used model ensembles and feature concatenation to fuse the information in different representations and enhance the generalization ability of the model; however, both approaches increase computational complexity and storage requirements. The proposed knowledge distillation-based collaborative learning framework combines the complementary information learned from different model structures and different representations to improve the performance of the audio classification model without increasing resource costs.
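To make the collaborative framework concrete, the following is a minimal sketch of mutual distillation between two peer networks fed different representations of the same audio clips. The loss follows the standard softened-softmax formulation of knowledge distillation; the peer pairing (log-mel spectrogram vs. MFCC), the temperature, and the weighting are illustrative assumptions, not the thesis's exact configuration.

```python
import torch.nn.functional as F

def mutual_kd_loss(logits_a, logits_b, labels, T=4.0, alpha=0.5):
    """Loss for one peer: cross-entropy on the hard labels plus KL
    divergence toward the other peer's temperature-softened predictions."""
    ce = F.cross_entropy(logits_a, labels)
    kd = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                  F.softmax(logits_b.detach() / T, dim=1),
                  reduction="batchmean") * (T * T)  # T^2 rescales gradients
    return (1.0 - alpha) * ce + alpha * kd

# Each peer sees a different representation of the same clips and distills
# from the other, so complementary information is shared during training
# while only a single network needs to be deployed at inference time:
#   loss_a = mutual_kd_loss(net_a(logmel), net_b(mfcc), labels)
#   loss_b = mutual_kd_loss(net_b(mfcc), net_a(logmel), labels)
```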
Experiments on acoustic scene classification and general audio tagging tasks demonstrate the effectiveness of the proposed knowledge distillation framework.
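Similarly, the adversarial feature distillation idea from the first contribution can be sketched as below: a small discriminator tries to tell teacher feature maps from student feature maps, while the student both imitates the teacher's features and tries to fool the discriminator, which supervises fine-grained structure that a plain imitation term alone would blur. The discriminator architecture, the L2 imitation term, and the unit loss weights are assumptions for illustration; the thesis's exact networks and objective may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureDiscriminator(nn.Module):
    """Judges whether a feature map comes from the teacher or the student."""
    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, 1))  # one logit: "real" (teacher) vs. "fake" (student)

    def forward(self, fmap):
        return self.net(fmap)

def adversarial_distillation_losses(teacher_fmap, student_fmap, disc):
    """Returns (discriminator loss, student loss) for one training step."""
    bce = F.binary_cross_entropy_with_logits

    # Discriminator: teacher features are "real", student features are "fake".
    d_real = disc(teacher_fmap.detach())
    d_fake = disc(student_fmap.detach())
    d_loss = (bce(d_real, torch.ones_like(d_real)) +
              bce(d_fake, torch.zeros_like(d_fake)))

    # Student: imitate the teacher's feature map (L2) and fool the
    # discriminator (adversarial term); the two are optimized alternately.
    g_fake = disc(student_fmap)
    s_loss = (F.mse_loss(student_fmap, teacher_fmap.detach()) +
              bce(g_fake, torch.ones_like(g_fake)))
    return d_loss, s_loss
```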
Keywords/Search Tags: Audio classification, Knowledge distillation, Convolutional neural networks, Deep learning