Research On Mongolian Music Classification Based On Transformer

Posted on: 2022-12-08   Degree: Master   Type: Thesis
Country: China   Candidate: Y Song   Full Text: PDF
GTID: 2505306779975819   Subject: Telecom Technology
Abstract/Summary:
In the era of big data, digital music and online music services have developed rapidly, and the demand for Music Information Retrieval (MIR) continues to grow. Music Genre Classification (MGC) is an important research topic in MIR and plays a key role in tasks such as automatic music classification and content-based or semantic-based retrieval. Mongolian music has a long history as a national music genre and contains the essence of traditional Chinese culture, but it suffers from outdated methods of transmission and slow development. Using music classification techniques to classify and study Mongolian music can support its inheritance and development. Most existing music classification methods suffer from information loss and inadequate feature extraction during the feature extraction stage, as well as inadequate model structure design. To address these problems, this thesis improves the traditional feature extraction method and conducts an in-depth study of Mongolian music classification based on deep learning. The research mainly includes:

(1) Constructing the Mongolian music dataset. Drawing on the cultural background of Mongolian music, this thesis constructs a Mongolian music dataset by collecting, filtering, and annotating 1000 Mongolian music works covering 10 different Mongolian music styles. The dataset is augmented by audio slicing, and the numbers of audio clips obtained after segmentation into 30 s, 10 s, and 3 s pieces are counted.

(2) Improving the traditional feature extraction method. Music signals contain complex frequencies and rich semantic information, and finding effective representations of musical features is central to MGC. In the current feature extraction stage, however, most extracted features are traditional audio features. To address this, the thesis proposes extracting wav2vec features from a self-supervised pre-trained model and compares the classification performance of wav2vec features with that of traditional audio features. The experimental results show that the wav2vec-based representation is more comprehensive, classifies better than traditional audio features, and can effectively represent music features.

(3) Comparing and analyzing the influence of network model structure on music classification. Because music data is inherently temporal, the memory characteristics of a Bi-LSTM neural network are used to learn and classify feature sequences. A self-attention mechanism is then introduced to improve the classification model, and the experimental results show that the improved model raises the accuracy of music classification. To further verify the influence of network structure on music classification, the thesis designs a comparative experiment based on the Transformer network model structure. The experimental results show that the Transformer structure is more suitable for Mongolian music classification than the Bi-LSTM structure.

(4) Proposing a multi-feature fusion method. To enrich the diversity of features, experiments verify the classification performance of single audio features and of fused multiple audio features under the Transformer classification model, and also verify the classification performance after fusing wav2vec features with audio features. The fused features achieve the best performance with the classification model, which demonstrates the effectiveness of the multi-feature fusion method proposed in this thesis.
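
The audio-slicing augmentation described in (1) can be sketched as follows. This is a minimal illustration, not the thesis's code: the function name `slice_clip`, the toy 100 Hz sample rate, and the choice of non-overlapping segments with the trailing remainder dropped are all assumptions, since the abstract does not specify how the slicing was done.

```python
def slice_clip(samples, sample_rate, segment_seconds):
    """Split a 1-D sequence of audio samples into fixed-length segments.

    Assumptions (not stated in the abstract): segments do not overlap,
    and trailing samples that do not fill a whole segment are dropped.
    """
    segment_len = int(sample_rate * segment_seconds)
    if segment_len < 1:
        raise ValueError("segment must contain at least one sample")
    n_segments = len(samples) // segment_len
    return [
        samples[i * segment_len:(i + 1) * segment_len]
        for i in range(n_segments)
    ]


# Toy stand-in for one 30-second track at a hypothetical 100 Hz sample rate.
SAMPLE_RATE = 100
track = [0.0] * (30 * SAMPLE_RATE)

# Count the clips produced at each segment length named in the abstract.
for seconds in (30, 10, 3):
    clips = slice_clip(track, SAMPLE_RATE, seconds)
    print(f"{seconds}s segments: {len(clips)} clip(s)")
```

Under these assumptions, shorter segments multiply the number of training examples per track, which is the sense in which slicing enlarges the dataset; each clip would inherit the style label of the work it came from.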
Keywords/Search Tags:Music information retrieval, Mongolian music, Music classification, Feature fusion, Classification model