
Research On Music Similarity Calculation Method Based On Deep Learning

Posted on: 2022-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: H Liu
Full Text: PDF
GTID: 2518306494471294
Subject: Computer Science and Technology
Abstract/Summary:
Music similarity calculation is an important branch of music information retrieval. It supports the detection of music plagiarism and other forms of content-based music retrieval, which makes music similarity a meaningful research problem. Similarity between pieces of music can take several forms: emotional similarity, similar music-theoretic characteristics, similar genre, and so on. In application scenarios such as cover-song identification and plagiarism detection, it is appropriate to focus on comparing music content and music-theoretic features. Traditional methods for music similarity comparison suffer from poor accuracy and inflexible feature extraction. To address these problems, we propose a music similarity calculation method based on deep learning. We study how different low-level features and deep learning model structures affect the extraction of the main melody of music, and how deep learning can be applied to music similarity calculation. The main contributions of this paper are as follows:

1. To reduce interference from other music information, we first extract the main melody. Exploiting the strengths of convolutional neural networks in image processing, we use a semantic segmentation model with an encoder-decoder convolutional architecture to extract the main melody. For the input, the audio is converted into two two-dimensional features: the Generalized Cepstrum (GC) and the Generalized Cepstrum of Spectrum (GCOS). In addition, Mel-frequency cepstral coefficients (MFCC) and chroma features are manually extracted and stacked into the input in a multi-channel manner, so that the input data contain tonal and vocal information. We also add a channel-based attention mechanism to the model. Experiments show that adding the hand-crafted features accelerates the convergence of model training, and that the multi-feature fusion model with the attention mechanism improves overall accuracy over the baseline while also reducing the false-alarm rate.

2. Because music is temporal and its content is contextually connected, this paper uses a bi-directional long short-term memory (BiLSTM) network combined with an attention mechanism to encode the input data. For the input, we mainly select the main-melody pitch as the primary feature, supplemented by two important music content features: tonality and rhythm. The data are grouped into tonal clusters and the cluster labels are encoded as vectors, so that data in the same cluster lie closer together. The experiments are divided into three parts, comparing the effects of the attention mechanism, the distance formula, and the choice of music features on the results; we then demonstrate the end-to-end performance of the complete pipeline. Experiments show that the BiLSTM with attention achieves higher accuracy, that using cosine distance in the loss function better separates the clusters, and that combining main-melody and rhythm features as input yields the best performance.
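The channel-based attention mechanism described in contribution 1 can be illustrated with a minimal numpy sketch in the squeeze-and-excitation style: each input channel (e.g. the stacked GC, GCOS, MFCC, and chroma planes) is reweighted by a learned scalar gate. The weight shapes, the reduction factor, and the toy data below are illustrative assumptions, not the thesis model's actual parameters:

```python
import numpy as np

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel attention (illustrative sketch).

    x  : feature maps, shape (C, H, W), e.g. stacked GC/GCOS/MFCC/chroma planes
    w1 : squeeze-layer weights, shape (C, C // r)
    w2 : excitation-layer weights, shape (C // r, C)
    Returns x rescaled per channel by a gate in (0, 1).
    """
    # Squeeze: global average pooling collapses each channel to one scalar.
    z = x.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: small bottleneck MLP followed by a sigmoid gate.
    s = np.maximum(z @ w1, 0.0)                  # ReLU
    s = 1.0 / (1.0 + np.exp(-(s @ w2)))          # sigmoid, shape (C,)
    # Scale: reweight each channel before the encoder-decoder backbone.
    return x * s[:, None, None]

# Toy usage: 4 feature channels, reduction factor r = 2 (both made up).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w1 = rng.standard_normal((4, 2))
w2 = rng.standard_normal((2, 4))
y = channel_attention(x, w1, w2)
print(y.shape)  # (4, 8, 8)
```

The gate lets the network learn, per excerpt, how much each hand-crafted feature channel should contribute before the segmentation backbone sees it.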
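The cosine distance used in the loss of contribution 2 can be sketched directly: two encoded melody vectors that point in the same direction have distance 0, while unrelated (orthogonal) encodings have distance 1. The vectors below are made-up toy encodings, not outputs of the thesis model:

```python
import numpy as np

def cosine_distance(u, v):
    """Cosine distance between two encoded melody vectors: 1 - cos(u, v)."""
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy encodings (hypothetical values for illustration).
a = np.array([0.2, 0.9, 0.4])
b = np.array([0.4, 1.8, 0.8])   # same direction as a (a scaled by 2)
c = np.array([0.9, -0.2, 0.0])  # orthogonal to a

print(round(cosine_distance(a, b), 6))  # 0.0
print(round(cosine_distance(a, c), 6))  # 1.0
```

Because cosine distance ignores vector magnitude, training against it pushes encodings in the same tonal cluster toward a common direction, which is what increases the discrimination between clusters.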
Keywords/Search Tags:music similarity, deep learning, multi-feature fusion, music feature extraction