Research On Time Domain End-to-end Detection Algorithm Of Music Source Separation Based On U-shaped Network

Posted on:2022-06-18

Degree:Master

Type:Thesis

Country:China

Candidate:J T Bai

GTID:2568307034474644

Subject:IC Engineering

Abstract/Summary:

Music source separation is one of the most important research topics in music information retrieval. Its main goal is to extract one or more target sources while suppressing other sources and noise. As a preprocessing step for many music information retrieval tasks, its results strongly influence the downstream tasks, so it has important research value. Traditional music source separation methods suffer from problems such as dependence on assumptions, limited model complexity, and a lack of representation ability. End-to-end time-domain deep learning models address these problems, but they take a long time to train and their separation performance still needs to be improved. To further improve the representation ability and computational efficiency of the end-to-end time-domain separation model, we propose an end-to-end network, Unet-SE-BiSRU, based on Demucs, the current state-of-the-art model for time-domain separation. The model proposed in this thesis makes three main improvements. First, the bidirectional long short-term memory (BiLSTM) is replaced with a bidirectional simple recurrent unit (BiSRU), which reduces the number of model parameters, further improves the parallelism of training, and greatly reduces the total training time of the model. Second, an attention mechanism is introduced in the generalized encoding and decoding layers: a squeeze-and-excitation block selectively extracts features according to the type of audio to be separated, so that the waveforms of different target sources are represented more precisely and separation performance improves. Finally, group normalization is added after the one-dimensional convolution to address exploding or vanishing gradients during training and thereby accelerate the convergence of the model. Through comprehensive experiments, the optimal parameters of the model were determined, the three improvements were verified, and the model was compared on the MUSDB18 dataset with Demucs, the current best end-to-end model, and with other typical models in this field. The experimental results show that the improved network raises the average signal-to-distortion ratio (SDR) by 0.34 dB, which is, to the best of our knowledge, the best separation performance among end-to-end time-domain methods at present, and that the training time is reduced to 2/5 of that of the original model. In addition, the model achieves the best separation performance on the drum and bass sources and is comparable to the best separation model in terms of average signal-to-noise ratio. Because the number of channels in the model can be further increased under the same computing power constraint, the model has great potential for further performance improvement.
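
To make two of the ideas in the abstract concrete, the following is a minimal sketch in plain Python, not the thesis implementation. It shows a generic squeeze-and-excitation gate applied to one-dimensional feature maps and a basic signal-to-distortion ratio in dB. The function names (`squeeze_excite`, `sdr_db`), the weight arguments (`w_reduce`, `w_expand`), and the toy values are assumptions made for illustration; the thesis's actual layer layout and its evaluation procedure may differ, and the SDR here is the simple energy-ratio definition rather than a full evaluation toolkit.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def squeeze_excite(features, w_reduce, w_expand):
    """Generic squeeze-and-excitation gate for 1-D feature maps.

    features: C channels, each a list of T samples.
    w_reduce: (C/r) x C bottleneck weights (hypothetical, normally learned).
    w_expand: C x (C/r) expansion weights (hypothetical, normally learned).
    Returns the rescaled features and the per-channel gates.
    """
    # Squeeze: average each channel over time to get one scalar per channel.
    squeezed = [sum(channel) / len(channel) for channel in features]
    # Excitation: bottleneck with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]
    gates = [sigmoid(g) for g in matvec(w_expand, hidden)]
    # Scale: multiply every sample of a channel by that channel's gate.
    scaled = [[g * x for x in channel] for g, channel in zip(gates, features)]
    return scaled, gates


def sdr_db(reference, estimate):
    """Simple SDR in dB: reference energy over error energy."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_energy / error_energy)


if __name__ == "__main__":
    # Toy input: 4 channels, 3 time steps, reduction ratio r = 2.
    feats = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [-1.0, 1.0, -1.0], [4.0, 4.0, 4.0]]
    w1 = [[0.5, 0.0, 0.0, 0.1], [0.0, 0.2, 0.3, 0.0]]
    w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.3, 0.3]]
    scaled, gates = squeeze_excite(feats, w1, w2)
    print("gates:", [round(g, 3) for g in gates])
    print("SDR (dB):", round(sdr_db([1.0, 0.0, -1.0, 0.0], [0.9, 0.0, -0.9, 0.0]), 2))
```

Each gate lies strictly between 0 and 1, so the block can only attenuate channels relative to one another; in a trained network the two weight matrices are learned, which is what lets the gating depend on the content of the audio being separated.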

Keywords/Search Tags:

Audio signal processing, Music source separation, U-net, Simple recurrent units, Squeeze-and-excitation, Group normalization

Related items

1	Optical Music Recognition Algorithm Combining Multi-scale Residual Convolutional Neural Network And Simple Recurrent Units
2	High-resolution sinusoidal analysis for resolving harmonic collisions in music audio signal processing
3	Research For Self-attention Based Audio Source Separation Model
4	Source-specific learning and binaural cues selection techniques for audio source separation
5	Research On Underdetermined Convolutive Blind Source Separation Algorithm And Application In Audio Signal Processing
6	The Study Of Dual-channel Speech Separation Technology For Smart Mobile Devices
7	Blind Source Separation And Its Application In Electrocardiography And Speech Signal Processing
8	Research On Acoustic Scene Classification Using Deep Learning
9	Research On Music Source Feature Extraction And Separation Algorithm Based On Deep Neural Network
10	Research On Blind Source Separation And Its Application In Flaw Signal Processing