Machine Learning For Underdetermined Speech Separation

Posted on:2017-07-22

Degree:Master

Type:Thesis

Country:China

Candidate:L L Chen

Full Text:PDF

GTID:2348330488458688

Subject:Information and Communication Engineering

Abstract/Summary:

Speech separation plays an important role in many speech processing systems, such as speech recognition and speaker recognition. High-quality speech not only better meets the demands of human hearing but is also an important prerequisite for subsequent speech processing. In real-world scenarios, signals are often corrupted by noise, which has made speech separation a hot research topic in recent years. The core of speech separation is to recover the source signals from the observed mixture. This thesis focuses on the underdetermined speech separation problem. The main contributions are as follows:

(1) A novel one-source extraction method based on stepwise separation and a softmax classifier is proposed. The algorithm is suited to extracting a signal of interest from a mixture. Given any utterance from the target speaker, called the reference signal, the target signal is identified by a softmax classifier trained on the single-source points of each source. The target signal is then extracted layer by layer using the stepwise separation method. The proposed method extracts the target signal well, has low computational complexity, and does not require much prior information.

(2) Building on the strong feature extraction capability and nonlinear mapping ability of deep neural networks (DNNs), a supervised, strongly discriminative single-channel speech separation (SCSS) method is proposed. Correlation coefficients and negentropy are added to the objective function to reduce interference. Moreover, a new training strategy based on curriculum learning is explored to further enhance separation performance: the training samples are first sorted by a ranking function and then gradually introduced into DNN training, from easy to complex. The experimental results show that the proposed algorithm outperforms the compared approaches.

(3) Matrix factorization (MF) is combined with a deep neural network to solve the single-channel speech separation problem. First, each source signal is factorized into a dictionary and an encoding matrix. Then a DNN is trained to learn the mapping from the mixture data to the encoding matrices. Finally, for test data, each source is recovered from its corresponding dictionary and the encoding matrix predicted by the trained DNN. The experimental results demonstrate that the proposed algorithm provides better encoding matrices and improves the quality of the separated speech compared with conventional MF-based methods, at the cost of DNN training.

The effectiveness of the proposed methods is substantiated by a series of experiments on the TIMIT corpus.
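
For orientation, the conventional MF-based baseline referred to in (3) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation. The abstract does not specify the factorization, so the sketch assumes nonnegative matrix factorization of magnitude spectrograms with Euclidean-cost multiplicative updates and a soft-mask reconstruction. The function names `nmf`, `encode`, and `separate` are hypothetical. According to the abstract, the proposed method replaces the iterative encoding step (here `encode`) with a DNN that predicts the encoding matrix from the mixture; that network is not shown.

```python
import numpy as np

EPS = 1e-9  # small constant to avoid division by zero


def nmf(V, rank, n_iter=200, seed=0):
    """Factorize a nonnegative matrix V (freq x frames) as W @ H.

    W is the dictionary, H the encoding matrix. Uses multiplicative
    updates for the Euclidean cost (an assumption, not from the thesis).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def encode(V, W, n_iter=200, seed=0):
    """Estimate the encoding matrix H for a fixed dictionary W."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
    return H


def separate(V_mix, W1, W2):
    """Split a mixture spectrogram using two pre-trained dictionaries."""
    W = np.hstack([W1, W2])        # concatenated dictionary
    H = encode(V_mix, W)           # the step a DNN would replace
    k = W1.shape[1]
    S1 = W1 @ H[:k]                # source 1 model
    S2 = W2 @ H[k:]                # source 2 model
    total = S1 + S2 + EPS
    # Soft masks distribute the mixture energy between the two sources.
    return V_mix * S1 / total, V_mix * S2 / total
```

In use, `nmf` would be run once per speaker on that speaker's training spectrogram to obtain the dictionaries `W1` and `W2`, after which `separate(V_mix, W1, W2)` returns two magnitude estimates with the same shape as the mixture.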

Keywords/Search Tags:

One source extraction, single channel speech separation, Softmax, deep neural network, discriminative objective function

Related items

1	Research On Single-channel Speech Separation Technology Based On Deep Learning
2	Research On Single-channel Speech Separation Technology Based On Dictionary Learning And Deep Neural Network
3	Single Channel Speech Separation Technology Based On Deep Neural Network And Research
4	Research On Single Channel Speech Signal Separation Based On Sparse Representation And Deep Learning
5	Single Channel Speech Separation Methods Based On Deep Neural Network
6	Study On The Underdetermined Speech Separation Based On Deep Neural Network
7	Research On Key Technologies For Multi-source Separation With Deep Neural Networks
8	Single-Channel Speech Separation Using Sequential Dictionary Learning
9	Research And Implementation Of Single Channel Speech Separation With Unknown Number Of Speakers
10	Research And Design Of Speech Separation Algorithm Based On Deep Learning