Machine Learning For Underdetermined Speech Separation

Posted on: 2017-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: L L Chen
GTID: 2348330488458688
Subject: Information and Communication Engineering
Abstract/Summary:
Speech separation plays an important role in many speech processing systems, such as speech recognition and speaker recognition. High-quality speech not only better meets the demands of human hearing but is also an important guarantee for subsequent speech processing. In real scenarios, signals are often corrupted by noise, which has made speech separation a hot topic in recent years. The core of speech separation is to recover the source signals from the observed mixture. This thesis focuses on the underdetermined speech separation problem. The main research work covers the following aspects:

(1) A novel one-source extraction method based on stepwise separation and a softmax classifier is proposed. The algorithm is suited to extracting a signal of interest from a mixture. Given any utterance from the target speaker, called the reference signal, the target signal is identified by a softmax classifier trained on the single-source points of each source. The target signal is then extracted layer by layer by the stepwise separation method. The proposed method extracts the target signal well with low computational complexity and does not require much prior information.

(2) Building on the strong feature extraction capability and nonlinear mapping ability of deep neural networks (DNNs), a supervised, strongly discriminative single channel speech separation (SCSS) method is proposed. Correlation coefficients and negentropy are added to the objective function in order to reduce interference. Moreover, a new training strategy based on curriculum learning is explored to further enhance separation performance: the training samples are first sorted by a ranking function and then gradually introduced into DNN training, from easy to complex. The experimental results show that the proposed algorithm outperforms the compared approaches.

(3) Matrix factorization (MF) is combined with a deep neural network to solve the single channel speech separation problem. First, each source signal is factorized into a dictionary and an encoding matrix. Then a DNN is trained to learn the mapping from the mixture data to the encoding matrices. Finally, for test data, each source is recovered from its corresponding dictionary and the encoding matrix predicted by the trained DNN. The experimental results demonstrate that the proposed algorithm provides better encoding matrices and improves the quality of the separated speech compared with conventional MF-based methods, at the cost of DNN training.

The effectiveness of the proposed methods is substantiated by a series of experiments on the TIMIT corpus.
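The factorize-then-recover pipeline described in (3) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the abstract does not state which factorization is used, so non-negative MF with Euclidean multiplicative updates is assumed here, and the DNN that predicts the encoding matrices is replaced by a conventional stand-in that fits the encodings against the fixed dictionaries. The function names (`nmf`, `encode_with_fixed_dictionary`, `separate`) and all parameter values are hypothetical.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero in the multiplicative updates


def nmf(V, rank, n_iter=200, seed=0):
    """Factorize a non-negative matrix V (freq x frames) as D @ H.

    D plays the role of the dictionary and H the encoding matrix.
    Assumption: Euclidean multiplicative updates; the thesis may differ.
    """
    rng = np.random.default_rng(seed)
    D = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (D.T @ V) / (D.T @ D @ H + EPS)
        D *= (V @ H.T) / (D @ H @ H.T + EPS)
    return D, H


def encode_with_fixed_dictionary(V, D, n_iter=200, seed=0):
    """Stand-in for the trained DNN: estimate H for a fixed dictionary D.

    This is the conventional MF-based encoding step, not a neural network.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((D.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (D.T @ V) / (D.T @ D @ H + EPS)
    return H


def separate(mixture, dictionaries):
    """Recover each source as (its dictionary) @ (its block of the encoding)."""
    D_all = np.hstack(dictionaries)
    H_all = encode_with_fixed_dictionary(mixture, D_all)
    estimates, start = [], 0
    for D in dictionaries:
        stop = start + D.shape[1]
        estimates.append(D @ H_all[start:stop])
        start = stop
    return estimates
```

In use, one dictionary would be learned per speaker from training spectrograms with `nmf`, and `separate` would then be called on the magnitude spectrogram of a mixture. In the approach the abstract describes, the call to `encode_with_fixed_dictionary` is where the DNN prediction would take over.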
Keywords/Search Tags: One source extraction, single channel speech separation, softmax, deep neural network, discriminative objective function