Music Recognition Based On Deep Network And Hashing Learning

Posted on: 2019-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Yu
Full Text: PDF
GTID: 2428330545465300
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, with the development of the Internet and digital audio technology, Music Information Retrieval (MIR) has become a research hotspot. Within this field, the effective recognition of music genres is an important research topic, and instrument and emotion recognition are also active areas. At present, recognition and classification systems in MIR mainly follow a three-step process: music features are first extracted manually, a classifier is then trained on them, and finally the features of the music to be recognized are fed into the trained model for recognition and classification. However, manual feature extraction has run into bottlenecks. Deep learning, as a new feature extraction technique, has achieved outstanding performance in image processing and natural language understanding. This thesis therefore uses the feature extraction capability of deep learning to discover features better suited to music recognition, and designs different network structures for music recognition tasks based on these features.

First, because most music genre recognition methods based on temporal features perform poorly, we propose using the Harmonic/Percussive Sound Separation (HPSS) algorithm, which considers both time and frequency characteristics. HPSS separates the spectrogram of the original music signal into harmonic components with distinct temporal characteristics and percussive components with frequency characteristics, and these are input into a Convolutional Neural Network (CNN) together with the original spectrogram. We then design the structure of the CNN and study the impact of different parameters on the recognition rate.

With the increasing amount of image data, recognition methods have several drawbacks, such as the low expressive power of visual features, high feature dimensionality, and low image recognition precision. To solve these problems, we propose a novel hashing method, Convolutional Recurrent Neural Network Hashing (CRNNH), which exploits a convolutional recurrent neural network to generate effective hash codes. First, the music signal is preprocessed into Mel-spectrograms, the preferred input type for music recognition, which are input into a pre-trained CNN. We extract convolutional feature maps from multiple convolutional layers of the pre-trained CNN. We then apply bilinear interpolation and a similarity selection strategy to the feature maps of each convolutional layer to obtain an image pyramid representation, which is input into a Long Short-Term Memory (LSTM) network and a hash layer. Finally, we propose a new loss function for recognition in the softmax layer that accounts for the quantization error of the hash codes generated from the output of the hash layer, while simultaneously maintaining the semantic similarity and balance property of the hash codes. Experimental results on our dataset demonstrate that the proposed CRNNH achieves superior performance over other state-of-the-art hashing methods.
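
The abstract names three properties the hash-code loss is meant to capture (quantization error, semantic similarity, and balance) but does not give its formula. The following is only a minimal sketch of how such terms are commonly written in deep hashing, not the thesis's actual loss. All function names, the squared-error forms, the +1/-1 similarity targets, and the weights `w_quant`, `w_balance`, `w_sim` are assumptions made for illustration; the softmax classification term is omitted.

```python
from itertools import combinations


def binarize(h):
    """Map real-valued hash-layer outputs to binary codes in {-1, +1}."""
    return [1.0 if v >= 0 else -1.0 for v in h]


def quantization_loss(batch):
    """Mean squared gap between relaxed outputs and their binary codes."""
    total, count = 0.0, 0
    for h in batch:
        for v, b in zip(h, binarize(h)):
            total += (v - b) ** 2
            count += 1
    return total / count


def balance_loss(batch):
    """Penalize bits whose batch mean drifts from 0 (assumed balance target:
    each bit is +1 for about half of the samples and -1 for the rest)."""
    n_bits = len(batch[0])
    means = [sum(h[k] for h in batch) / len(batch) for k in range(n_bits)]
    return sum(m * m for m in means) / n_bits


def similarity_loss(batch, labels):
    """Pairwise term (assumed form): the normalized inner product of two
    codes should be +1 for the same label and -1 for different labels."""
    n_bits = len(batch[0])
    pairs = list(combinations(range(len(batch)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        target = 1.0 if labels[i] == labels[j] else -1.0
        inner = sum(a * b for a, b in zip(batch[i], batch[j])) / n_bits
        total += (inner - target) ** 2
    return total / len(pairs)


def hash_regularizer(batch, labels, w_quant=1.0, w_balance=1.0, w_sim=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_quant * quantization_loss(batch)
            + w_balance * balance_loss(batch)
            + w_sim * similarity_loss(batch, labels))


if __name__ == "__main__":
    # Two 2-bit relaxed codes from different (made-up) classes.
    outputs = [[0.9, -0.8], [-0.7, 0.6]]
    print(hash_regularizer(outputs, labels=[0, 1]))
```

In a real training setup these terms would be computed on tensors in a deep learning framework and added to the classification loss; plain Python lists are used here only to keep the sketch self-contained.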
Keywords/Search Tags: musical recognition, convolutional neural network, deep hashing, convolutional recurrent neural network