Single-Channel Speech Separation Using Sequential Dictionary Learning

Posted on:2016-08-18

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Xu

Full Text:PDF

GTID:2308330467494915

Subject:Signal and Information Processing

Abstract/Summary:

Speech separation, an efficient way to recover the underlying sources from a mixture, has been attracting increasing attention. The number of microphones on an intelligent terminal is often smaller than the number of speech sources, and in extreme conditions only one microphone is available. Hence single-channel speech separation (SCSS) technology is becoming more and more important.

Recently, dictionary learning algorithms have been increasingly used to solve the SCSS problem. Existing dictionary learning algorithms assume that the speech of each speaker has its own unique components, so that the speech of different speakers can be sparsely represented by different dictionaries. Because a speech signal is only short-term stationary, it has to be decomposed into a series of short segments. Such segmentation, however, increases the correlation between different speakers' speech signals; that is, speech segments from different speakers may contain similar components. This thesis proposes a novel sequential discriminative dictionary learning (SDDL) algorithm for separating the mixed signal in an SCSS system, and a new speech post-processing framework for improving the quality of the separated speech. The main content and innovations are as follows:

1. We take both the unique and the similar components of different speakers' signals into account and design a multi-layer, sequentially structured dictionary that contains discriminative sub-dictionaries and a buffer sub-dictionary in each layer. In the training stage, an objective function is derived which guarantees that the unique components of the training sets are explained by their corresponding discriminative sub-dictionaries, and that the similar components of the training sets are explained by the buffer sub-dictionary rather than by the cross sub-dictionaries. The components distributed in the buffer sub-dictionary are used as the training sets for the next layer. In the separation stage, the unique components of different speakers' signals, which correspond more closely to the speakers' labels, are separated first, and the similar components are separated in the next layer. Experimental results verify that the proposed SDDL-based SCSS algorithm effectively reduces source confusion compared with existing algorithms.

2. The separated speech signals are still mixed with other sources and exhibit a certain degree of distortion. A new speech post-processing framework is therefore proposed. It contains an adaptive separation module, which suppresses the influence of the mismatch between the training sets and the test sets; a time-frequency filter module, which further suppresses the confusion in the separated speech; and a harmonic regeneration module, which suppresses the distortion of the separated speech. Experimental results show that the proposed post-processing framework improves the quality of the separated speech.
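The separation stage and the time-frequency filtering described above can be illustrated with a minimal Python sketch of a single layer. It is not the thesis's implementation. The following are all assumptions made for illustration: each frame is a real-valued feature vector, sparse coding is done with a plain orthogonal matching pursuit (OMP) rather than the thesis's own objective function, the filter is a Wiener-style soft mask, and the dictionaries are toy identity-based matrices rather than learned ones. The names `omp`, `separate_layer`, and `soft_mask` are hypothetical.

```python
import numpy as np


def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit: sparse code of x over the columns of D."""
    residual = x.astype(float).copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        coef[support] = sol
    return coef


def separate_layer(x, D1, D2, Db, n_nonzero=4):
    """Code one mixed frame over [D1 | D2 | Db] and split it by sub-dictionary.

    D1, D2: discriminative sub-dictionaries of speakers 1 and 2 (assumed given).
    Db:     buffer sub-dictionary holding components shared by both speakers.
    Returns the two speaker estimates and the shared part; in a multi-layer
    scheme the shared part would be handed to the next layer.
    """
    D = np.hstack([D1, D2, Db])
    a = omp(D, x, n_nonzero)
    n1, n2 = D1.shape[1], D2.shape[1]
    s1 = D1 @ a[:n1]
    s2 = D2 @ a[n1:n1 + n2]
    shared = Db @ a[n1 + n2:]
    return s1, s2, shared


def soft_mask(s1, s2, mixture, eps=1e-12):
    """Wiener-style soft mask (an assumed filter, not the thesis's exact module)."""
    p1, p2 = s1 ** 2, s2 ** 2
    m1 = p1 / (p1 + p2 + eps)
    return m1 * mixture, (1.0 - m1) * mixture


# Toy example: identity columns stand in for learned atoms.
eye = np.eye(6)
D1, D2, Db = eye[:, :2], eye[:, 2:4], eye[:, 4:]
frame = np.array([2.0, 0.0, 3.0, 0.0, 1.0, 0.0])
est1, est2, shared_part = separate_layer(frame, D1, D2, Db, n_nonzero=3)
filtered1, filtered2 = soft_mask(est1, est2, frame)
```

In this toy setup the component that lies in the buffer sub-dictionary is assigned to neither speaker in the first layer, which mirrors the idea of deferring similar components to the next layer instead of forcing them onto a speaker too early.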

Keywords/Search Tags:

Single-channel speech separation, sequential discriminative dictionary learning (SDDL), speech post-processing, time-frequency masking, harmonic regeneration

Related items

1	Speech Enhancement Based On Sparse Representation And Dictionary Learning
2	Machine Learning For Underdetermined Speech Separation
3	Research On Speech Enhancement Algorithms Of Microphone Array Based On Time-Frequency Masking
4	Study On Speech Separation And Speech Enhancement Methods
5	Blind Separation Of Multiple Speech Signals
6	Research On Single-channel Speech Enhancement Method Based On Deep Neural Networks And Time-frequency Masking
7	Research On Underdetermined Convolutive Speech Signal Separation Methods
8	Research On Underdetermined Blind Speech Separation Based On Sparsity In Time-frequency Domain
9	Research On Two Methods Of Single Channel Speech Separation
10	Research On Single-channel Speech Separation Technology Based On Deep Learning