
Research On Music Emotion Recognition Based On Multi-Level Features And Fusion Model

Posted on: 2024-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: Z P Zhong
Full Text: PDF
GTID: 2545307142466214
Subject: Computer Science and Technology
Abstract/Summary:
With the continued deepening of research on music technology, Music Emotion Recognition (MER) has been widely applied in music recommendation, music therapy, sound-and-light scene construction, and other areas. A computer converts the way music expresses emotion into music emotion features through feature extraction, and a recognition model establishes the relationship between those features and an emotion model in order to recognize music emotions. Music emotions are complex and diverse, and describing them with a fixed set of precise emotional terms is limiting. Continuous dimensional emotion models map emotions to arbitrary points in a continuous dimensional space and can therefore capture rich and nuanced emotions. This thesis studies the V-A (valence-arousal) continuous dimensional emotion model from the perspectives of music emotion features and music emotion recognition models. The main research content includes:

(1) Previous studies on music emotion recognition concatenated the extracted features into a single music emotion feature set and fed it directly into the recognition model, without verifying the correlation between the features and the ground-truth labels; this can lengthen training time and reduce recognition accuracy. To address this, the music emotion features were divided, following human cognition, into two categories: low-level and mid-/high-level. Low-level music emotion features were extracted from the audio files based on the eGeMAPS feature set, while mid-/high-level music emotion features were extracted from music composition elements. Because the extracted low-level features are standardized, feature selection was applied only to the mid-/high-level features, keeping those highly correlated with emotion to obtain the optimal mid-/high-level feature subset.

(2) Most research on music emotion recognition considers either low-level or mid-/high-level music emotion features alone, ignoring the multi-level nature of music emotion features. To address this, the low-level features extracted in (1) were fused with the selected mid-/high-level features to form a multi-level music emotion feature set. An SVR model was used to compare recognition accuracy before and after fusion, and the experiments showed that the fused multi-level features achieved the best recognition accuracy.
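The abstract does not specify the extraction toolchain or the exact feature-selection method used in step (1); purely as a hypothetical illustration, the sketch below uses the openSMILE Python bindings to compute the 88 eGeMAPS functionals per clip and a simple correlation-based filter to rank precomputed mid-/high-level features against the valence labels. The file names, the choice of f_regression, and k = 20 are assumptions, not the thesis's settings.

```python
import numpy as np
import opensmile
from sklearn.feature_selection import SelectKBest, f_regression

# Low-level features: eGeMAPS functionals computed over a whole clip.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)
low_level = smile.process_file("clip_0001.wav")   # 1 x 88 feature DataFrame

# Mid-/high-level features (tempo, mode, harmony, etc., derived from
# composition elements) are assumed to be precomputed and stored on disk.
X_mid = np.load("mid_high_features.npy")          # shape: (n_clips, n_features)
y_valence = np.load("valence_labels.npy")         # shape: (n_clips,)

# Filter-style selection: keep the k features most correlated with the label.
selector = SelectKBest(score_func=f_regression, k=20)
X_mid_selected = selector.fit_transform(X_mid, y_valence)
```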
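For step (2), a minimal sketch of concatenation-style fusion followed by SVR validation; the file names, RBF kernel, C value, and R² scoring are assumptions rather than the settings reported in the thesis.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Fuse the two feature levels by concatenating along the feature axis.
X_low = np.load("low_level_egemaps.npy")          # shape: (n_clips, 88)
X_mid = np.load("mid_high_selected.npy")          # shape: (n_clips, k)
X_fused = np.concatenate([X_low, X_mid], axis=1)
y_arousal = np.load("arousal_labels.npy")

# SVR baseline used to compare recognition accuracy before and after fusion.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
scores = cross_val_score(model, X_fused, y_arousal, cv=5, scoring="r2")
print("mean R^2 with fused features:", scores.mean())
```

Running the same pipeline on X_low or X_mid alone gives the pre-fusion baselines against which the fused result would be compared.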
(3) Based on the multi-level music emotion features, the thesis simulates the process by which humans perceive music and express emotion. To address the long-distance dependency and low training-efficiency problems of long short-term memory networks in music emotion recognition, a fusion model named CBSA (CNN-BiLSTM-Self-Attention) was proposed; it combines a CNN-BiLSTM backbone with a self-attention mechanism and is applied to regression training for long-distance music emotion recognition. The model uses a two-dimensional convolutional neural network to extract local key information from the music emotion features, a bidirectional long short-term memory network to turn that local information into sequential music emotion features, and a self-attention mechanism to dynamically re-weight the resulting sequence and highlight the global key points of music emotion. The experimental results showed that the CBSA model shortens the time needed to learn the regularities in music emotion data and effectively improves music emotion recognition accuracy.
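The abstract names the CBSA components (a two-dimensional CNN, a BiLSTM, and self-attention) but not their hyperparameters; the PyTorch sketch below wires the three stages together with assumed layer sizes and a mean-pooled regression head, so it should be read as an architectural outline rather than the thesis's exact model.

```python
import torch
import torch.nn as nn

class CBSA(nn.Module):
    """CNN + BiLSTM + self-attention regressor; all layer sizes are assumptions."""

    def __init__(self, n_features: int, hidden: int = 128, heads: int = 4):
        super().__init__()
        # 2-D convolution over the (time, feature) plane extracts local key patterns.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),        # pool along the feature axis only
        )
        # BiLSTM turns the local patterns into a sequential representation.
        self.bilstm = nn.LSTM(
            input_size=16 * (n_features // 2),
            hidden_size=hidden,
            batch_first=True,
            bidirectional=True,
        )
        # Self-attention re-weights time steps to highlight global key points.
        self.attn = nn.MultiheadAttention(
            embed_dim=2 * hidden, num_heads=heads, batch_first=True
        )
        # Regression head predicting a single valence or arousal value.
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features)
        b, t, _ = x.shape
        h = self.conv(x.unsqueeze(1))                # (batch, 16, time, features // 2)
        h = h.permute(0, 2, 1, 3).reshape(b, t, -1)  # flatten channels per time step
        h, _ = self.bilstm(h)                        # (batch, time, 2 * hidden)
        h, _ = self.attn(h, h, h)                    # attention over the time axis
        return self.head(h.mean(dim=1)).squeeze(-1)

model = CBSA(n_features=88)
dummy = torch.randn(4, 60, 88)                       # 4 clips, 60 time steps, 88 features
print(model(dummy).shape)                             # torch.Size([4])
```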
Keywords/Search Tags: Music emotion recognition, Music emotion feature selection, Two-dimensional convolutional neural network, Long short-term memory neural network, Self-attention model