
Research On Music Source Feature Extraction And Separation Algorithm Based On Deep Neural Network

Posted on: 2022-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: C H Peng
Full Text: PDF
GTID: 2518306764967629
Subject: Automation Technology
Abstract/Summary:
The purpose of music source separation is to decompose a piece of music into its constituent source signals, and deep neural network based approaches have become the mainstream for this task. A survey of the current state of research on deep neural network based music source separation shows that the performance of existing algorithms has reached a bottleneck; how to strengthen a model's ability to extract music source features, and thereby improve separation performance, is a problem that urgently needs to be solved. Accordingly, this thesis focuses on deep neural network based music source feature extraction and separation algorithms, and designs and implements an automatic music source separation system. The main work is summarized as follows.

(1) A music source feature extraction and separation algorithm based on a skip attention mechanism and amplitude spectrum features

To alleviate the feature loss caused by downsampling in the encoder, a SA-CEDN network model based on a skip attention mechanism is designed. A music source feature extraction module (FEM) is proposed to obtain multi-scale feature information from the amplitude spectrogram of the mixed music, and a music source feature extraction and separation network (SA-CEDN-4FEM) is constructed by combining SA-CEDN with the FEM. The effect of the masking type at the model output on separation performance is also analyzed. In the vocal and accompaniment separation task, the proposed SA-CEDN-4FEM improves the vocal evaluation metrics GNSDR, GSIR, and GSAR by 0.35 dB, 0.52 dB, and 0.2 dB, respectively, compared with SHN-4, and improves the GSAR of the accompaniment by 0.36 dB. In the four-source separation task (Bass, Drums, Other, and Vocals), SA-CEDN-4FEM improves on SHN-4 by 0.2 dB, 0.22 dB, 0.36 dB, and 0.15 dB, respectively.

(2) A music source feature extraction and separation algorithm based on a self-attention mechanism and phase spectrum features

To capture the correlations among different frequency features of music sources, self-attention convolution blocks are designed to construct SAEDN, which further reduces the number of model parameters. The phase spectrum of the music is corrected and then fused with the amplitude spectrum information to improve model performance. The original loss function is also improved by combining time-frequency domain and time-domain loss functions. Compared with SHN-4 in the vocal and accompaniment separation task, the final SAEDN-4FEM improves the vocal evaluation metrics GNSDR, GSIR, and GSAR by 0.39 dB, 0.47 dB, and 0.13 dB, respectively, and improves the GSAR of the accompaniment by 0.44 dB. In the four-source separation task (Bass, Drums, Other, and Vocals), SAEDN-4FEM improves by 0.32 dB, 0.3 dB, 0.49 dB, and 0.17 dB, respectively.

(3) Design and implementation of a deep neural network based automatic music source separation system

The system mainly comprises subsystems for music source separation and separation record management. Using the two models described above, separation of vocal and accompaniment sources from user-specified music data is achieved, and the separated amplitude spectra are displayed.
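Item (1) above analyzes how the masking type at the model output affects separation performance. The abstract does not state which mask types were compared, so as an illustrative assumption the sketch below builds the two most common choices, an ideal binary mask and an ideal ratio mask, on a toy time-frequency magnitude grid:

```python
import numpy as np

def ideal_binary_mask(target_mag, interferer_mag):
    # 1 in time-frequency bins where the target dominates, 0 elsewhere
    return (target_mag >= interferer_mag).astype(float)

def ideal_ratio_mask(target_mag, interferer_mag, eps=1e-8):
    # soft mask in [0, 1]: target's share of the total magnitude per bin
    return target_mag / (target_mag + interferer_mag + eps)

# toy magnitudes on a 2 (frames) x 3 (frequency bins) grid
target = np.array([[3.0, 1.0, 0.5], [2.0, 0.2, 4.0]])
interf = np.array([[1.0, 2.0, 0.5], [0.5, 3.0, 1.0]])

ibm = ideal_binary_mask(target, interf)
irm = ideal_ratio_mask(target, interf)
# masked estimate of the target magnitude from the mixture magnitude
est = (target + interf) * irm
```

The binary mask assigns each bin wholly to one source, while the ratio mask splits each bin proportionally; a separation network typically predicts such a mask and multiplies it with the mixture amplitude spectrogram.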
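Item (2) improves the loss function by combining time-frequency domain and time-domain terms. The abstract gives no formula, so the sketch below assumes a simple weighted sum of an L1 waveform loss and an L1 magnitude-spectrogram loss; the weight `alpha`, the frame length, and the unwindowed framewise FFT are illustrative choices, not the thesis's actual design:

```python
import numpy as np

def frame_mag(x, n_fft=8):
    # crude magnitude "spectrogram": non-overlapping, unwindowed framewise FFT
    frames = x[: len(x) // n_fft * n_fft].reshape(-1, n_fft)
    return np.abs(np.fft.rfft(frames, axis=1))

def combined_loss(est, ref, alpha=0.5):
    # weighted sum of a time-domain L1 term and a time-frequency L1 term
    time_term = np.mean(np.abs(est - ref))
    tf_term = np.mean(np.abs(frame_mag(est) - frame_mag(ref)))
    return alpha * time_term + (1.0 - alpha) * tf_term

t = np.linspace(0.0, 1.0, 64, endpoint=False)
ref = np.sin(2.0 * np.pi * 4.0 * t)                # clean reference source
est = ref + 0.05 * np.sin(2.0 * np.pi * 9.0 * t)   # estimate with a small error
loss = combined_loss(est, ref)
```

The time-domain term is sensitive to phase errors that a magnitude-only loss cannot see, which is one common motivation for mixing the two.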
Keywords/Search Tags: Music source separation, amplitude spectrum, self-attention mechanism, phase spectrum