Study On The Underdetermined Speech Separation Based On Deep Neural Network

Posted on:2018-04-09

Degree:Master

Type:Thesis

Country:China

Candidate:Z G Miao

GTID:2348330536961160

Subject:Electronic and communication engineering

Abstract/Summary:

Speech is one of the most commonly used media for communication. In real environments it is inevitably affected by noise, and speech separation arose to address this. Speech separation is the process of separating the source signal from a mixed signal, and it plays an important role in speech processing systems such as speech recognition, speaker recognition and audio retrieval. This paper is devoted to underdetermined speech separation, in particular single-channel speech separation. The research work mainly consists of the following aspects:

(1) A single-channel speech separation method combining a phase-sensitive time-frequency mask with a deep neural network is proposed. The input of the network is the amplitude spectrum of the mixed signal, and the target output is the proposed phase-sensitive time-frequency mask, which introduces the phase information of the speech signal. During training, signal approximation is used as the objective function. In the test phase, the source signal is reconstructed with the phase of the mixed signal. Compared with other methods, the proposed method, which uses phase information, achieves better results.

(2) Considering that a complex time-frequency mask can restore the amplitude spectrum and the phase spectrum of the speech simultaneously, this paper combines the complex time-frequency mask with a deep neural network for single-channel speech separation. The amplitude spectrum of the mixed signal is still used as the network input, and the complex time-frequency mask is used as the target output. Because the complex time-frequency mask is difficult to compute, predicting it directly with a deep neural network is not accurate. Therefore, an objective function with a phase constraint is proposed, which helps the phase estimation. In the test phase, the estimated phase is used to reconstruct the clean source signal. This method can accurately estimate the phase of the source signal, which leads to better separation results.

A series of experiments is carried out on the TSP speech database. The results show that the proposed method achieves better speech separation quality than other existing methods.
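The two training targets named above can be illustrated with a minimal NumPy sketch. It uses the definitions commonly found in the speech separation literature (phase-sensitive mask as magnitude ratio times the cosine of the phase difference, complex mask as the ratio of source to mixture spectra). The abstract does not give the exact formulas, network, or phase-constraint objective used in the thesis, so the function names, the `eps` stabiliser and the loss form below are assumptions for illustration only.

```python
import numpy as np

def phase_sensitive_mask(source_stft, mixture_stft, eps=1e-8):
    """Real-valued mask |S|/|Y| * cos(theta_S - theta_Y) (common literature form)."""
    ratio = np.abs(source_stft) / (np.abs(mixture_stft) + eps)
    phase_diff = np.angle(source_stft) - np.angle(mixture_stft)
    return ratio * np.cos(phase_diff)

def complex_ratio_mask(source_stft, mixture_stft, eps=1e-8):
    """Complex-valued mask M chosen so that M * Y is approximately S."""
    power = np.abs(mixture_stft) ** 2 + eps
    return source_stft * np.conj(mixture_stft) / power

def signal_approximation_loss(estimated_mask, mixture_mag, target_spectrum):
    """Mean squared error between the masked mixture and the target spectrum.

    Assumed form: the error is measured on the masked signal, not on the mask.
    """
    return np.mean(np.abs(estimated_mask * mixture_mag - target_spectrum) ** 2)

def apply_mask(mask, mixture_stft):
    """Apply a mask to the mixture STFT.

    A real mask keeps the mixture phase (method 1); a complex mask also
    changes the phase (method 2).
    """
    return mask * mixture_stft

# Toy spectra standing in for STFT bins of a source and an interfering signal.
source = np.array([[1 + 1j, 2 - 1j]])
interference = np.array([[0.5 - 0.5j, 1 + 2j]])
mixture = source + interference

psm = phase_sensitive_mask(source, mixture)
crm = complex_ratio_mask(source, mixture)
estimate_mixture_phase = apply_mask(psm, mixture)   # phase taken from the mixture
estimate_complex = apply_mask(crm, mixture)         # magnitude and phase restored
```

Under these definitions the phase-sensitive mask equals the real part of the complex mask, which is one way to see why the complex mask can recover phase information that the real-valued mask cannot.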

Keywords/Search Tags:

Speech Separation, Deep Neural Network, Complex Time-Frequency Mask, Objective Function

Related items

1	Machine Learning For Underdetermined Speech Separation
2	Speech Separation Technology Based On Deep Learning
3	Research On Design Of Objective Function For Deep Neural Network Based Speech Enhancement
4	Research On Supervised Speech Separation Based On Deep Learning
5	Study On Objective Quality Comprehensive Evaluation Of Face Mask Speech
6	Binaural Speech Separation Research Based On Deep Learning Of Time Series
7	Speech Enhancement Based On Iterative Mask Estimation And Generative Adversarial Networks
8	Time Domain Speech Separation Algorithm Based On Deep Neural Network
9	Research On Monaural Speech Separation Technology Based On Deep Learning Joint Optimization And Feature Fusion
10	Supervised Speech Separation Using The Optimal Ratio Mask