Music Separation Method Based On U-shaped Network And Audio Fingerprint

Posted on: 2022-08-12

Degree: Master

Type: Thesis

Country: China

Candidate: K L Liu

GTID:2518306569497494

Subject: Computer technology

Abstract/Summary:

With the development of computer technology, the performance of speech processing has improved significantly. As one of the basic tasks of speech processing, speech separation has received widespread attention. This thesis explores a music separation method combined with audio fingerprint retrieval. The specific work includes improving the separation model based on a U-shaped network architecture and re-optimizing the separation result through audio fingerprint retrieval of the background music, for which an audio fingerprint database is constructed.

First, an end-to-end music and vocal separation model based on deep learning is proposed. The model follows the U-shaped network architecture and is divided into an encoder part and a decoder part, both of which adopt a time-series convolution residual module. The first part of the module is a set of one-dimensional convolution filters applied to the time-domain waveform, which capture different local information in the voice signal, corresponding to different periodic signal components; the second part of the module uses a residual structure to support feature extraction with deep networks. Experimental results on the public MUSDB18 data set show that the proposed method has advantages in some aspects over the traditional time-frequency spectrogram-based method and the commonly used waveform-based method. In addition, introducing binary masking as a post-processing module significantly improves the separation of specific music components.

Second, building on the U-shaped network for music separation, background music audio fingerprint retrieval is proposed to further optimize the separation result. The audio fingerprint retrieval module first builds an audio fingerprint database and uses the landmark fingerprint construction algorithm to convert audio features into fingerprints, which are used to construct and match against the background music fingerprint database at the re-separation stage. The specific process is as follows: the background music separated by the U-shaped network is used to search a library of pure background music, and the retrieved background music is then used to separate the original audio. At the re-separation stage, the retrieved audio clip is aligned with the original audio in time and volume, and is then subtracted from the waveform of the mixed audio to obtain a purer voice. Experimental results show that the proposed re-separation strategy based on audio fingerprint retrieval effectively improves voice separation. When models whose separation performance differs widely are combined with the retrieval re-separation module, the gap between their results is significantly reduced, which indicates that the re-separation strategy based on audio fingerprint retrieval is robust.

The main contributions of this work are twofold: first, a new music and vocal separation model is proposed that achieves outstanding results in some aspects; second, audio retrieval technology is applied in a novel way to further optimize the music separation task and achieve a better separation effect.
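The re-separation step described in the abstract (align the retrieved background clip with the mixture in time and volume, then subtract it from the waveform) can be illustrated with a minimal sketch. The abstract does not say how the alignment is computed, so the choices here are assumptions: cross-correlation for the time offset and a least-squares gain for the volume. The function names `estimate_offset`, `estimate_gain`, and `subtract_background` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def estimate_offset(mixture, clip):
    """Return the sample offset in `mixture` where `clip` correlates best.

    Assumption: plain (unnormalised) cross-correlation, with the clip
    shorter than the mixture and mixed in with positive gain.
    """
    # 'valid' mode slides the clip across the mixture without padding,
    # so index k of the result corresponds to an offset of k samples.
    corr = np.correlate(mixture, clip, mode="valid")
    return int(np.argmax(corr))


def estimate_gain(segment, clip):
    """Least-squares gain g that minimises ||segment - g * clip||^2."""
    energy = float(np.dot(clip, clip))
    if energy == 0.0:
        return 0.0  # a silent clip carries no volume information
    return float(np.dot(segment, clip)) / energy


def subtract_background(mixture, clip):
    """Align `clip` to `mixture` in time and volume, then subtract it.

    Returns the residual waveform plus the estimated offset and gain.
    """
    mixture = np.asarray(mixture, dtype=float)
    clip = np.asarray(clip, dtype=float)
    offset = estimate_offset(mixture, clip)
    segment = mixture[offset:offset + len(clip)]
    gain = estimate_gain(segment, clip)
    residual = mixture.copy()
    residual[offset:offset + len(clip)] -= gain * clip
    return residual, offset, gain
```

In this sketch `mixture` would be the original audio and `clip` the background music returned by fingerprint retrieval; the residual is the estimate of the purer voice. Direct cross-correlation is quadratic in signal length, so for long recordings an FFT-based correlation, or the coarse offset already implied by the fingerprint match, would likely be used instead.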

Keywords/Search Tags:

speech separation, U-shaped network, sequential convolution module, audio fingerprint, landmark algorithm

Related items

1	Single-Channel Speech Separation Using Sequential Dictionary Learning
2	Research On Speech Separation Algorithm Based On Deep Neural Networks
3	Research On Audio-visual Speech Separation
4	Research On Facial Landmark Detection Algorithm Based On AC-SE-ResNeXt Convolution Neural Network
5	Convolutional Neural Network For Speech Separation
6	Audio-visual Multimodal Fusion Speech Separation Based On DCNN-BiLSTM And Improved U-Net Network
7	Speech Separation Based On Microphone Array And Deep Learning
8	Research On Audio Visual Fusion Speech Separation Method For Multi-person Dialogue Robot
9	Research On Multi-modal Speech Separation Based On Audio-visual Combination
10	The Research Of Segmented Audio Retrieval Algorithm Based On Audio Fingerprint