
Deep Attention Based Music Genre Classification

Posted on: 2019-12-15    Degree: Master    Type: Thesis
Country: China    Candidate: J N Yao    Full Text: PDF
GTID: 2428330566984180    Subject: Computer application technology
Abstract/Summary:
With the development of Internet technology and multimedia equipment, the amount of digital music on various application platforms is growing rapidly, and this huge volume of music creates confusion for both audiences and platforms. Organizing and distinguishing such a large collection by manual effort is clearly impossible, so constructing a convenient way to deal with this problem is vitally important yet challenging. Music genre is a top-level label that helps audiences categorize and describe music and divides it into different groups. For this reason, music genre classification has attracted wide attention in the field of music information retrieval (MIR).

As two crucial components of music genre classification, feature extraction and classifier learning greatly influence the performance of most classification systems. Traditional methods design feature extraction and classifier learning separately: feature extraction concentrates on deriving suitable representations from raw audio signals, expressed as feature vectors or pairwise similarities, and the extracted representations are then fed into a classifier to accomplish classification. However, extracting hand-crafted features requires complex processing and demands expertise in the musical domain. Furthermore, features extracted for one particular task lack universality, since they may perform poorly on other tasks. In recent years, deep learning has been applied successfully in many fields, such as computer vision and natural language processing, and many researchers have therefore used spectrograms of music signals together with deep learning models to perform music genre classification. Although deep learning has demonstrated its strength in music genre classification, the classification accuracies of existing methods remain unsatisfying. This thesis therefore concentrates on improving the accuracy of deep-learning-based classification systems.

In this thesis, we propose a bidirectional recurrent neural network (BRNN) based model incorporating an attention mechanism, with two attention schemes: a serial attention model and a parallelized attention model. In the serial architecture, the BRNN is trained on the training datasets and then extracts music features from the audio signals. The serial linear attention model computes attention scores for the different representations and generates attention probabilities by normalizing these scores; the attention-weighted feature representation is then fed into a classifier to perform classification. Nevertheless, the performance of the serial attention model relies on the BRNN, so the attention probabilities may be influenced by the previous outputs of the BRNN. Considering this bottleneck of the serial architecture, a BRNN with a parallelized attention model is proposed to modify it. In addition, besides the linear attention model, a more complicated CNN attention model is also designed for the parallelized architecture.

The two classification models are applied to two standard datasets to evaluate their performance. The experimental results show that, compared with serial attention, the parallelized attention models are more powerful and achieve higher accuracies. Moreover, taking STFT spectrograms as input, the parallelized attention model with the CNN implementation outperforms previous work. The improved classification accuracies demonstrate the effectiveness and efficiency of the proposed models.
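To make the serial attention scheme concrete, the following is a minimal PyTorch-style sketch of a BRNN with linear attention pooling over spectrogram frames. It is not the thesis implementation; the class name, layer sizes, genre count, and frame/bin dimensions are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): bidirectional RNN + serial linear
# attention for genre classification. All sizes and names are assumptions.
import torch
import torch.nn as nn

class BRNNAttentionClassifier(nn.Module):
    def __init__(self, n_bins=128, hidden=128, n_genres=10):
        super().__init__()
        # Bidirectional GRU reads the spectrogram frame by frame.
        self.brnn = nn.GRU(n_bins, hidden, batch_first=True, bidirectional=True)
        # Linear attention: one score per time step, normalized by softmax.
        self.score = nn.Linear(2 * hidden, 1)
        self.classifier = nn.Linear(2 * hidden, n_genres)

    def forward(self, spec):                        # spec: (batch, frames, n_bins)
        h, _ = self.brnn(spec)                      # h: (batch, frames, 2*hidden)
        alpha = torch.softmax(self.score(h), dim=1) # attention probabilities
        context = (alpha * h).sum(dim=1)            # attention-weighted summary
        return self.classifier(context)             # genre logits

# Usage with a dummy batch of STFT spectrograms (8 clips, 300 frames, 128 bins).
model = BRNNAttentionClassifier()
logits = model(torch.randn(8, 300, 128))            # shape: (8, 10)
```

In this serial form the attention scores are computed from the BRNN hidden states themselves, which is exactly the dependency the parallelized variant in the thesis is designed to relax by computing attention from a separate branch over the input representation.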
Keywords/Search Tags:Music genre classification, Feature extraction, Recurrent neural network, Attention mechanism