Research On Sound Source Separation Algorithm Based On Deep Neural Network

Posted on:2022-11-03

Degree:Master

Type:Thesis

Country:China

Candidate:T H Li

GTID:2518306782452044

Subject:Automation Technology

Abstract/Summary:

In speech processing, it is common for several people to talk at the same time so that their voices are mixed together. Depending on whether reverberation is present, such mixtures can be classified as dry mixtures or reverberated mixtures. Both reduce the efficiency and accuracy of speech processing, so we would like to separate the clean sounds from the mixture efficiently and quickly. With the rapid development of (deep) neural networks, many excellent speech separation algorithms based on them have been proposed. These algorithms fall into three classes: deep clustering, semantic segmentation models, and "encoder-separator-decoder" architectures. In this study, we find that all three classes can be abstracted into one: the separation algorithm with a reference signal. Based on this algorithm, we attempt to separate clean sounds from single-channel reverberated mixtures. The main contributions of this thesis are as follows.

First, we improve the method of generating mixture-clean data pairs. Very few open-source datasets of reverberated mixtures exist, so we generate reverberated mixtures by convolving clean sounds with impulse responses. Building on previous work, we refine the impulse response generation method and improve the generated data pairs by taking the characteristics of reverberated mixtures into account.

Second, following the scheme of reference-signal-assisted separation, we design neural networks for two targets: generating reference signals and separating speech. The reference signals are required to be reverberation-free and to characterize the main features of the clean signal. In line with these requirements, and taking into account the complexity, model size, and inference speed, we propose a pre-processing scheme of stride-four sampling and two complementary networks, the reverberation removal network and the mixture separation network. We also propose a sample rate restoration network for speech separation with a reference signal. This network is designed to make full use of the reference signal and to exploit the hidden information in the reverberated mixture. A high-pass filter is added to the loss function of the sample rate restoration network, which implicitly assigns larger weights to the high-frequency components, cancels the negative effects of downsampling, and improves separation quality. For training, specialized strategies are designed according to the characteristics of each task, which maximize the efficiency of both subnetworks and of the combined network.

Finally, we verify the superiority of the proposed method through experiments. We compare the proposed network with previous networks: SVoice, SuDoRM-RF, LSTM-TasNet, Conv-TasNet, DPRNN-TasNet, and DPTNet. The reverberated mixtures are computed by convolving clean audio with impulse responses, where the clean audio is taken from the LibriSpeech dataset and the impulse responses are taken from both the FUSS dataset and a dataset generated with virtual rooms. In terms of inference speed, the proposed network is faster. In terms of separation quality, the proposed network performs slightly better when the SI-SNR of the input signal is large, and performs better when the SI-SNR of the input signal is small.
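
The two operations the abstract relies on, building a reverberated mixture by convolving clean signals with impulse responses and scoring a signal with SI-SNR, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`reverberate`, `mix_reverberant`, `si_snr`, `toy_rir`), the exponentially decaying noise used as an impulse response, and the random noise standing in for speech are all assumptions made for illustration. The SI-SNR formula used is the common scale-invariant definition, which the abstract does not spell out.

```python
import numpy as np


def reverberate(clean, rir):
    """Convolve a clean signal with an impulse response, truncated to the clean length."""
    return np.convolve(clean, rir)[: len(clean)]


def mix_reverberant(sources, rirs):
    """Sum the reverberated versions of equal-length sources into one single-channel mixture."""
    wet = [reverberate(s, h) for s, h in zip(sources, rirs)]
    return np.sum(wet, axis=0)


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB (common definition, assumed here)."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to remove any dependence on gain.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    noise = estimate - projection
    return 10 * np.log10(
        (np.dot(projection, projection) + eps) / (np.dot(noise, noise) + eps)
    )


def toy_rir(length, decay, rng):
    """Hypothetical impulse response: a direct path followed by decaying noise."""
    h = rng.standard_normal(length) * np.exp(-decay * np.arange(length))
    h[0] = 1.0
    return h


# Illustrative usage with random signals standing in for two speakers.
rng = np.random.default_rng(0)
speaker_a = rng.standard_normal(16000)
speaker_b = rng.standard_normal(16000)
rirs = [toy_rir(2000, 0.005, rng), toy_rir(2000, 0.005, rng)]
mixture = mix_reverberant([speaker_a, speaker_b], rirs)
print("SI-SNR of the mixture against speaker A (dB):", si_snr(mixture, speaker_a))
```

In a realistic setup the clean signals would be speech recordings and the impulse responses would be measured or simulated for a room; an FFT-based convolution would usually replace `np.convolve` for long impulse responses.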

Keywords/Search Tags:

Reverberated Mixture, Generation of The Mixture, Neural Network, Reference Signal, Speaker Separation

Related items

1	The Study Of Blind Source Separation Algorithms Of The Noise Signal
2	Research On Algorithms For Blind Source Separation, Signal Construction And FSK Detection
3	Study On Blind Signal Separation Methods Under Various Mixture Models
4	Study On The Deception Detection Method Identified By The Automatic Speaker Verification System
5	Research On Robust Text-Independent Speaker Identification System In Noise
6	Mixed Voice Blind Signal Separation System Based On Independent Component Analysis
7	Application Of Blind Speech Separation Technology
8	Adaptive Gaussian Mixture Model And Its Application In Speaker Recognition
9	The Study Of Blind Speech Source Separation In Noising Environment
10	Research Of The Pattern Matching Method In Speaker Recognition System