
Research On Automatic Music Annotation And Mood Classification Methods

Posted on: 2019-06-29  Degree: Master  Type: Thesis
Country: China  Candidate: Y Xiong  Full Text: PDF
GTID: 2428330545477519  Subject: Computer technology
Abstract/Summary:
With the rapid development of multimedia technologies and applications, digital music has grown rapidly both online and offline over the past few decades and has become one of the major multimedia resources people use in daily life. At the same time, to efficiently manage, index, search and recommend music of interest from the vast amount of music data, a variety of content-based music information retrieval (MIR) techniques have been developed. Among them, automatically assigning a music clip a set of relevant semantic tags is an effective means of music information retrieval and is of great importance in many music-related applications such as music recommendation, playlist generation and music similarity measurement. Music tags are descriptive keywords that characterize the content and attributes of a music piece and convey its high-level semantic information, such as emotion, genre and instrument. As an important type of attribute and semantic tag, the mood category of a music piece describes the music's inherent emotional meaning and is regarded as an important criterion when people manage or search for music data. Accordingly, this thesis studies the automatic annotation and mood classification of music and proposes corresponding effective algorithms.

For the task of music annotation, this thesis presents a content-based automatic annotation method that effectively combines a convolutional neural network (CNN) and a recurrent neural network (RNN). Unlike most existing deep-neural-network-based annotation methods, the proposed method integrates multiple 1-D convolutional layers and depthwise separable convolutional layers into the CNN. Compared with the commonly employed 2-D convolutional layers, these layers learn richer representations of the music from 2-D Mel-spectrograms with fewer parameters and lower computational complexity, making the learning and inference of the network more efficient while improving performance. The proposed method also introduces the key Squeeze-and-Excitation building block of the SENet architecture into the CNN model to further enhance its performance. Finally, the proposed method appends an LSTM model on top of the CNN to capture the intrinsic time-varying sequential structure of the music.

For the task of music mood classification, we propose a generative multimodal method that automatically classifies the mood of a music piece by effectively learning the relevance between its audio and lyrics modalities. The proposed method takes the joint distribution of the two modalities, which distinctively captures the intrinsic characteristics of a specific music mood, as the key measure for describing and discriminating different mood types. Accordingly, this thesis presents effective algorithms for computing each elemental probability distribution in the joint distribution, including the word-to-audio and word-to-word correlations in the music as well as the prior probability of lyrics words. A music piece is then assigned to the mood category that maximizes the joint probability of the different modalities of the music data.

To verify the effectiveness of the proposed methods, this thesis conducts in-depth experiments on widely used music datasets such as MagnaTagATune and MusiClef. The experimental results show that, compared with relevant existing approaches, the proposed hierarchical deep-neural-network-based music annotation method and the generative multimodal music mood classification method effectively improve annotation and classification performance and achieve the expected research objective. Meanwhile, the proposed methods have the potential for further improvement and for application to other relevant tasks in follow-up work.
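The efficiency argument for replacing standard 2-D convolutions with 1-D depthwise separable convolutions can be illustrated with a simple parameter count. The layer sizes below (128 channels in and out, kernel width 3) are hypothetical choices for illustration, not values taken from the thesis:

```python
def conv2d_params(c_in, c_out, k):
    """Parameter count of a standard 2-D convolution with a k x k kernel."""
    return c_in * c_out * k * k

def sep_conv1d_params(c_in, c_out, k):
    """Parameter count of a 1-D depthwise separable convolution:
    a per-channel depthwise pass (c_in * k) followed by a 1x1
    pointwise channel mix (c_in * c_out)."""
    return c_in * k + c_in * c_out

# Hypothetical layer: 128 input channels, 128 output channels, width-3 kernel.
standard = conv2d_params(128, 128, 3)       # 147456 parameters
separable = sep_conv1d_params(128, 128, 3)  # 16768 parameters
print(standard, separable, round(standard / separable, 1))
```

Under these assumed sizes the separable layer uses roughly an order of magnitude fewer parameters, which is the source of the reduced computational cost mentioned above.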
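The Squeeze-and-Excitation mechanism referenced above can be sketched in a few lines of NumPy. This is a minimal illustration of the idea (channel-wise squeeze, two-layer bottleneck excitation, sigmoid rescaling), not the thesis's implementation; the weight shapes, reduction ratio and random inputs are all assumptions:

```python
import numpy as np

def squeeze_excitation(x, w1, w2):
    """Minimal SE block over a (channels, time) feature map:
    global-average-pool each channel (squeeze), pass through a
    two-layer bottleneck (excitation), then rescale the channels."""
    z = x.mean(axis=1)                    # squeeze: one scalar per channel
    h = np.maximum(0.0, w1 @ z)           # bottleneck with ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # sigmoid gates in (0, 1)
    return x * s[:, None]                 # reweight each channel

rng = np.random.default_rng(0)
C, T, r = 8, 16, 2                        # channels, time steps, reduction ratio
x = rng.standard_normal((C, T))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
y = squeeze_excitation(x, w1, w2)
print(y.shape)  # (8, 16)
```

Because the gates lie in (0, 1), the block can only attenuate channels, letting the network emphasize informative feature maps at negligible parameter cost.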
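The decision rule of the generative mood classifier, picking the mood that maximizes the joint probability, can be sketched with a toy lyrics-only model. The moods, words and probabilities below are invented for illustration, and the thesis's actual relevance model additionally couples word-to-audio correlations into the joint distribution:

```python
import math

# Toy per-mood word likelihoods p(w | mood) and mood priors p(mood).
# All numbers are illustrative, not estimated from any dataset.
word_given_mood = {
    "happy": {"sun": 0.5, "dance": 0.4, "tears": 0.1},
    "sad":   {"sun": 0.1, "dance": 0.1, "tears": 0.8},
}
mood_prior = {"happy": 0.5, "sad": 0.5}

def classify(lyrics_words):
    """Assign the mood maximizing the joint log-probability of the words."""
    def score(mood):
        s = math.log(mood_prior[mood])
        for w in lyrics_words:
            # Small floor stands in for smoothing of unseen words.
            s += math.log(word_given_mood[mood].get(w, 1e-6))
        return s
    return max(mood_prior, key=score)

print(classify(["tears", "tears", "sun"]))  # 'sad' under these toy numbers
```

The maximum-joint-probability rule is the same shape as the one described above; extending it to the multimodal case amounts to multiplying in the audio-conditioned terms alongside the word terms.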
Keywords/Search Tags:Music Annotation, Music Mood Classification, Multi-Media, Relevance Model, Deep Neural Network