Underdetermined Speech Separation Based On Sparse Representation And Deep Learning

Posted on:2018-09-15

Degree:Master

Type:Thesis

Country:China

Candidate:P Zhang

Full Text:PDF

GTID:2348330536962021

Subject:Information and Communication Engineering

Abstract/Summary:

Audio signals are often corrupted by environmental noise or by other interfering sources, and the mixing of multiple sources makes high-level audio applications such as speech recognition difficult. Recovering the sources from a mixture is therefore an important problem in audio processing. Humans can separate sources from a mixture easily, but the task is very hard for a computer system, especially in the underdetermined case, in which the number of mixture channels is smaller than the number of sources. This thesis focuses on the underdetermined audio source separation problem and covers the following two aspects.

(1) For the underdetermined convolutive separation problem, this thesis analyzes various sparsity-inducing functions and proposes a separation algorithm based on the lq norm (0 < q < 1). The algorithm exploits the strong sparsity-inducing ability of this norm to constrain the sparsity of the signals in the time-frequency domain. In addition, a low-rank prior is adopted to improve recovery accuracy. An optimization algorithm based on the proximity operator is derived. Experiments on the BSS Oracle corpus demonstrate that the proposed algorithm effectively improves separation quality.

(2) For the monaural audio source separation problem, this thesis proposes a separation algorithm based on a time-domain convolutional neural network (Time-CNN), whose input and output are both in the time domain. There are two key ideas behind the time-domain convolutional network: first, features are learned automatically by the convolutional layers instead of being extracted by hand, as with spectra; second, the phase can be recovered automatically, since both the input and the output are in the time domain. To improve recovery accuracy, a mixing loss function is proposed. In addition, a time-frequency mask is applied to the network output for better perceptual quality. Extensive experiments on the TSP corpus show that the proposed algorithm significantly improves monaural audio source separation performance.
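As a rough illustration of the proximity-operator idea mentioned in part (1), the sketch below compares soft thresholding (the proximity operator of the l1 norm) with a brute-force approximation of the proximity operator of the lq penalty, i.e. the minimizer of 0.5*(z - x)^2 + lam*|z|^q for each coefficient x. The abstract does not give the thesis's actual update rule, so everything here is an assumption: the function names `prox_l1` and `prox_lq_grid`, the parameters `lam`, `q`, and `n_grid`, and the grid search itself, which is only a simple stand-in for a derived solution. It also works on real scalars, whereas time-frequency coefficients are complex.

```python
import numpy as np

def prox_l1(x, lam):
    """Soft thresholding: proximity operator of lam * |z| (l1 baseline)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_lq_grid(x, lam, q, n_grid=2001):
    """Approximate proximity operator of lam * |z|**q, 0 < q < 1.

    For each scalar x, the minimizer of 0.5*(z - x)**2 + lam*|z|**q has the
    same sign as x and a magnitude in [0, |x|], so a grid search over that
    interval is enough for an illustration (not an efficient solver).
    """
    x = np.asarray(x, dtype=float)
    mag = np.abs(x)[..., None]                   # shape (..., 1)
    z = mag * np.linspace(0.0, 1.0, n_grid)      # candidate magnitudes
    cost = 0.5 * (z - mag) ** 2 + lam * z ** q   # objective on the grid
    idx = np.argmin(cost, axis=-1)[..., None]
    best = np.take_along_axis(z, idx, axis=-1)[..., 0]
    return np.sign(x) * best

# Hypothetical coefficients, chosen only for demonstration.
coeffs = np.array([-3.0, -0.2, 0.0, 0.2, 3.0])
print("l1 prox:", prox_l1(coeffs, lam=0.5))
print("lq prox:", prox_lq_grid(coeffs, lam=0.5, q=0.5))
```

In this example both operators set the small coefficients to zero, while the lq version shrinks the large coefficients less than soft thresholding does. This is one common reading of "strong sparsity-inducing ability"; the thesis itself should be consulted for the exact formulation.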

Keywords/Search Tags:

Underdetermined Source Separation, Monaural Source Separation, Convolutional Neural Networks, Deep Learning

Related items

1	Research On Monaural Voice And Accompaniment Separation Using Deep Learning
2	Underdetermined Blind Source Separation Algorithm Based On Deep Learning
3	Electromagnetic Signal Separation Based On Underdetermined Blind Source Separation Algorithm
4	Underdetermined Blind Source Separation Based On Improved K-means Clustering
5	Underdetermined Source Separation And Its Application To Speech Processing
6	The Algorithm Research On Sparse Component Analysis For Underdetermined Blind Source Separation
7	The Study On Underdetermined Blind Source Separation Of Mixed Speech Separation
8	The Research Of Underdetermined Blind Source Separation Under Noise Environment
9	Machine Learning For Underdetermined Speech Separation
10	Research On Underdetermined Blind Source Separation Algorithm Based On Compressed Sensing And Its Application