Research Of Speech Separation Based On Binaural Spatial Information

Posted on: 2016-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: X X Li
Full Text: PDF
GTID: 2308330503477812
Subject: Information and Communication Engineering
Abstract/Summary:
Speech separation, particularly separation modeled on human hearing, plays an important role in speech enhancement, speech recognition, and hearing aids. This dissertation combines binaural spatial information with the sparseness of speech signals and presents two speech separation methods based on binaural localization: one based on time-frequency masking and one based on compressive sensing. For separating mixed speech from multiple sound sources using spatial information, the main work of the dissertation is summarized as follows:

(1) The approach to speech separation based on binaural localization is analyzed. Spatial hearing is an important feature of the human auditory system. In an acoustic environment, the auditory system first integrates the spatial information of the sound sources; this information then passes through the central nervous system, which performs localization and separation. Following this characteristic of human hearing, the dissertation proposes first localizing the multiple sound sources with a binaural localization algorithm and then separating the mixed speech according to the localization result.

(2) A binaural sound source localization algorithm is studied. An azimuth mapping model is established by training on the two binaural localization cues, the interaural time difference (ITD) and the interaural intensity difference (IID). The localization method first extracts these parameters from the mixed multi-source speech and then, using a decision based on ITD together with a joint decision based on IID, obtains the localization result: the number of sound sources and their corresponding angles in the horizontal plane.

(3) A speech separation method based on time-frequency masking is presented. In an acoustic environment, at any single time-frequency point the most powerful source masks the weaker interfering signals, so that only one source dominates that point. Relying on this frequency-domain sparsity of speech signals, the method uses the localization results to assign every time-frequency point to its nearest source, then converts the time-frequency points of each source back to the time domain to obtain the separated speech. Experimental results show that the sound source localization algorithm achieves high accuracy for both single and multiple sound sources.

(4) A speech separation method based on compressive sensing is presented. The dissertation analyzes the binaural mixing model for multiple sources and shows that it has the same form as the compressive sensing model. Accordingly, the method builds a compressive sensing model from the binaural localization results, the dictionary information of the sound sources, and the mixed speech signals, and uses the orthogonal matching pursuit (OMP) algorithm for speech reconstruction. Experimental results show that the separation metrics SIR and SNR improve compared with the time-frequency masking method.
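
The nearest-source assignment described in item (3) can be illustrated with a minimal sketch. It is not the dissertation's implementation, and it rests on several assumptions that do not come from the abstract: only the ITD cue is used (the abstract also uses IID), the ITD at each time-frequency point is estimated from the interaural phase difference, frequencies are low enough that the phase does not wrap, and the per-source ITDs are already known from localization. The function names (estimate_itd, binary_masks, apply_mask, toy_point) are hypothetical.

```python
import cmath
import math


def estimate_itd(left, right, freq_hz):
    """Estimate the ITD (seconds) at one time-frequency point.

    Assumption: the right-ear bin equals the left-ear bin delayed by the ITD,
    so the phase of left * conj(right) is 2*pi*f*ITD (valid while |f*ITD| < 0.5).
    """
    phase_diff = cmath.phase(left * right.conjugate())
    return phase_diff / (2.0 * math.pi * freq_hz)


def binary_masks(left_tf, right_tf, freqs_hz, source_itds):
    """Assign every time-frequency point to the source with the nearest ITD.

    left_tf, right_tf: lists of frames, each frame a list of complex bins.
    freqs_hz: center frequency of each bin (must be non-zero).
    source_itds: one ITD per localized source (assumed given).
    Returns one 0/1 mask per source, with the same shape as left_tf.
    """
    masks = [[[0] * len(frame) for frame in left_tf] for _ in source_itds]
    for t, (l_frame, r_frame) in enumerate(zip(left_tf, right_tf)):
        for k, (l_bin, r_bin) in enumerate(zip(l_frame, r_frame)):
            itd = estimate_itd(l_bin, r_bin, freqs_hz[k])
            nearest = min(
                range(len(source_itds)),
                key=lambda s: abs(itd - source_itds[s]),
            )
            masks[nearest][t][k] = 1
    return masks


def apply_mask(tf, mask):
    """Keep only the time-frequency points that the mask assigns to one source."""
    return [
        [value * keep for value, keep in zip(frame, mask_row)]
        for frame, mask_row in zip(tf, mask)
    ]


def toy_point(amplitude, itd, freq_hz):
    """Build a left/right bin pair for a point dominated by a single source."""
    left = complex(amplitude, 0.0)
    right = left * cmath.exp(-2j * math.pi * freq_hz * itd)
    return left, right
```

In a complete pipeline, the masked time-frequency representation of each source would then be passed through an inverse short-time Fourier transform to recover the time-domain signal. That step is omitted here to keep the sketch free of external dependencies.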
Keywords/Search Tags: Speech Separation, Blind Source Separation, Binary Masking, Compressive Sensing