Application Of Deep Learning In Music Automatic Tagging

Posted on:2018-04-01

Degree:Master

Type:Thesis

Country:China

Candidate:Q Gong

Full Text:PDF

GTID:2348330512492056

Subject:Communication and Information System

Abstract/Summary:

Traditional automatic music tagging follows a fixed routine: starting from a labeled dataset, extract audio features from each song, then train a separate model for each label. This approach is redundant. Meanwhile, the emergence of larger music tagging datasets and the rise of deep learning in recent years have gradually changed how models are designed. In this thesis, we use deep learning and large datasets to build more concise and accurate tagging models. Specifically, we designed three model structures for different input features (mel-spectrogram, spectrogram, MFCC and raw audio) and evaluated them on the MagnaTagATune dataset, chosen because much previous work reports results on it, which makes comparison straightforward. The results show that the raw audio and mel-spectrogram models perform much better than the spectrogram and MFCC models. We then visualized the strongest response of the pre-trained mel-spectrogram model through gradient descent, starting from random noise as input. Next, to compare models of different depths, we used the Last.fm dataset, a subset of the MSD (Million Song Dataset). The results show that deeper models clearly outperform shallow ones, which agrees with recent work in computer vision. This result also implies that dataset size is important to the performance of deep learning models.

Our main contributions are as follows:

(1) We designed several deep learning models for automatic music tagging and used several musical features as model input. The results show that raw audio and mel-spectrogram inputs give much better performance than spectrogram and MFCC inputs. Our raw audio model also achieved a better AUC than previous work.

(2) We compared models of different depths on a larger dataset. The deeper models significantly outperformed the shallow ones, which agrees with recent work in computer vision. Comparing the effect of depth across datasets shows that dataset size strongly affects model performance: a larger dataset is more likely to bring out the full potential of a model.

(3) We visualized the strongest input response of every filter in every layer of the mel-spectrogram model and found that the frequency responses align with the human perceptual scale.
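The AUC figures mentioned above are for a multi-label task, where each clip can carry several tags at once. The abstract does not say how AUC was aggregated across tags, so the following is only a minimal sketch under one common assumption: compute ROC AUC separately for each tag and take the unweighted mean. The function names `roc_auc` and `mean_tag_auc` are illustrative and do not come from the thesis.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC for a single tag, using the rank-sum (Mann-Whitney U) form."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    # Sort clip indices by score, then assign 1-based ranks,
    # giving tied scores the average of the ranks they span.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def mean_tag_auc(
    label_matrix: Sequence[Sequence[int]],
    score_matrix: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of per-tag AUC (assumed aggregation, not from the thesis).

    Rows are clips and columns are tags.
    """
    n_tags = len(label_matrix[0])
    aucs = []
    for t in range(n_tags):
        labels = [row[t] for row in label_matrix]
        scores = [row[t] for row in score_matrix]
        aucs.append(roc_auc(labels, scores))
    return sum(aucs) / n_tags
```

Calling `mean_tag_auc(labels, scores)` with a clips-by-tags matrix of 0/1 labels and a matching matrix of model scores returns one number in the range 0 to 1. A tag that has no positive or no negative clips in the evaluation set raises an error here, so in practice such tags would need to be filtered out first.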

Keywords/Search Tags:

deep learning, music, automatic tagging

Related items

1	Application Of Deep Learning In Music Automatic Tagging
2	Deep Analysis On Labels For Music Auto-Tagging
3	Design And Implementation Of Clothing Image Auto-tagging System Based On Deep Learning
4	Learning From Limited And Imperfect Tagging
5	Research On Music Similarity Calculation Method Based On Deep Learning
6	Research On Part-of-Speech Tagging Algorithms Of Mathematical Corpus Based On Deep Learning
7	Research On The Separation Algorithm Of Instrumental Music Based On Deep Learning
8	An exploration of deep learning in content-based music informatics
9	Study Of Automatic Tagging System
10	Music Auto-tagging Based On Generative Adversarial Network