Underdetermined Anechoic Blind Speech Separation Based On Time-frequency Sparsity

Posted on: 2011-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: H B Yang
Full Text: PDF
GTID: 2198330338989879
Subject: Control Science and Engineering
Abstract/Summary:
To emulate the sophistication of the human auditory system, blind speech separation, which stems from research on the "cocktail party effect" and aims to separate the original speech source signals from the observed mixtures, has been an extensively studied problem in computer audition, speech recognition and speech enhancement. Traditional blind speech separation methods are based on independent component analysis, which can only solve the over-determined and even-determined problems. However, the underdetermined case is ubiquitous in practice. Most current underdetermined blind speech separation methods focus on the instantaneous model. In this thesis, we focus on the anechoic model, which is more practical. To solve the underdetermined blind speech separation problem, prior information about the speech is needed to convert the underdetermined problem into an even-determined one. In this thesis, an underdetermined blind speech separation method based on the time-frequency sparsity of speech is designed. The main contents of this thesis are as follows:

1. Framework: We study the framework of underdetermined blind anechoic speech separation based on time-frequency sparsity. Under this framework, we analyze the time-frequency sparsity of speech and discuss the separability of the underdetermined problem based on sparse recovery theory.

2. Algorithm analysis and design: The algorithm proposed in this thesis contains two stages. The first stage is mixing parameter estimation. Through an analysis using probability theory, we conclude that time-frequency disjointness cannot be deduced from time-frequency sparsity, and that the assumption that every source is active alone in some local time-frequency region is more reasonable. Hence, we improve the AD-TIFROM and AD-TIFCORR methods, which are based on single-active-region detection. Our method is more robust and overcomes a limitation of AD-TIFROM and AD-TIFCORR, which cannot estimate delays of a fractional number of samples. The second stage is source estimation, which in this thesis is based on sparse recovery. The ADM-BP sparse recovery algorithm is used to estimate the source signals, and it can be more efficient and robust than traditional sparse recovery algorithms.

3. Simulation experiment design: A series of simulation experiments is designed to evaluate our speech separation method. First, the mixing parameter estimation method is evaluated under various delay cases; second, the source estimation method is evaluated when the mixing parameters are known. The experiments are based on the image technique. Two types of microphone array are used: one is composed of three omnidirectional microphones, and the other is composed of two omnidirectional microphones and one cardioid microphone. Finally, we conclude that the second type of microphone array is appropriate for our method; hence, this microphone array is used to evaluate our underdetermined blind anechoic speech separation method.
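To illustrate why a region where a single source is active is useful for mixing parameter estimation, the following is a minimal sketch, not the thesis's improved AD-TIFROM/AD-TIFCORR method. It assumes a noise-free two-microphone anechoic model in which the second channel is an attenuated, delayed copy of the first. Under that assumption the ratio of the two channel spectra is a * exp(-j 2 pi f delta), so the attenuation can be read from the magnitude and a fractional delay from the slope of the phase. The function names `delay_signal` and `estimate_attenuation_delay`, the white-noise source, and the values 0.8 and 2.37 are all hypothetical choices made for the illustration.

```python
import numpy as np

def delay_signal(s, delay):
    """Circularly delay s by a possibly fractional number of samples,
    implemented as a linear phase shift in the DFT domain."""
    n = len(s)
    f = np.fft.rfftfreq(n)  # frequency in cycles per sample
    return np.fft.irfft(np.fft.rfft(s) * np.exp(-2j * np.pi * f * delay), n)

def estimate_attenuation_delay(x1, x2):
    """Estimate (a, delta) assuming x2 = a * delay(x1, delta),
    i.e. only one source is active in the analysed region."""
    X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
    f = np.fft.rfftfreq(len(x1))
    # Skip DC and Nyquist (they carry no usable phase) and near-empty bins.
    keep = (f > 0) & (f < 0.5) & (np.abs(X1) > 1e-8 * np.abs(X1).max())
    ratio = X2[keep] / X1[keep]
    a = np.mean(np.abs(ratio))
    # Unwrapped phase is modelled as -2*pi*f*delta; fit a line through the origin.
    w = 2 * np.pi * f[keep]
    phase = np.unwrap(np.angle(ratio))
    delta = -np.dot(w, phase) / np.dot(w, w)
    return a, delta

# Hypothetical single-source region: white noise, attenuation 0.8, delay 2.37 samples.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
x1 = s
x2 = 0.8 * delay_signal(s, 2.37)
a_hat, delta_hat = estimate_attenuation_delay(x1, x2)
print(a_hat, delta_hat)  # estimates should be close to 0.8 and 2.37
```

Because the delay is recovered from the phase slope rather than from a sample-domain cross-correlation peak, it is not restricted to integer values. With several sources active at once, this ratio no longer has constant magnitude and linear phase, which is why detecting single-active regions comes first.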
Keywords/Search Tags:underdetermined blind speech separation, anechoic mixing, time-frequency sparsity, single active, sparse recovery