
The Blind Separation Of Monaural Speech Based On Computational Auditory Scene Analysis

Posted on: 2017-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: R R Zhao
Full Text: PDF
GTID: 2308330503457524
Subject: Electronics and Communications Engineering
Abstract/Summary:
As the most direct and effective means of communication, speech is routinely corrupted by interference and noise from the surrounding environment. The human auditory system copes with this remarkably well: listeners can pick out a target signal in almost any acoustic situation. Simulating this perceptual process and modeling the auditory scene by computer is called computational auditory scene analysis (CASA). CASA is widely used for speech separation and has become an active topic in speech signal processing in recent years.

Building on a detailed analysis of the theory and classical algorithms of CASA, this thesis studies CASA-based monaural speech separation for two kinds of interference: non-speech noise and competing speech. The main research work is as follows.

For separating speech from non-speech interference, existing CASA-based algorithms have focused largely on voiced speech segregation, and little attention has been paid to unvoiced speech segregation. To address the heavy computational load and the inaccurate estimation of background noise in unvoiced speech segregation, this thesis proposes an improved unvoiced speech segregation algorithm based on CASA and spectral subtraction. In the improved approach, rough time-frequency (T-F) intervals are first located by onset/offset estimation. The noise energy of each T-F unit within an interval is then estimated individually, based on the principle that the energy of neighboring T-F units is continuous, which makes the noise energy estimate more accurate.
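The noise estimation step can be illustrated with a minimal sketch. The abstract does not give the estimator itself, so the code below rests on assumptions: it handles a single frequency channel, treats the frames bordering an onset/offset interval as noise only, and interpolates linearly between them as one simple way to use the continuity of neighboring T-F units. The function names `interpolate_noise` and `spectral_subtract` are hypothetical.

```python
def interpolate_noise(energy, onset, offset):
    """Estimate noise energy for frames onset..offset-1 of one channel.

    Assumption (not from the thesis): frames onset-1 and offset hold
    noise only, and noise energy varies smoothly between them.
    """
    if not 0 < onset < offset < len(energy):
        raise ValueError("interval needs a noise-only frame on each side")
    left, right = energy[onset - 1], energy[offset]
    span = offset - onset + 1
    # Linear interpolation between the two bordering noise frames.
    return [left + (right - left) * (k + 1) / span
            for k in range(offset - onset)]


def spectral_subtract(energy, onset, offset):
    """Subtract the per-unit noise estimate, flooring the result at zero."""
    noise = interpolate_noise(energy, onset, offset)
    return [max(e - n, 0.0) for e, n in zip(energy[onset:offset], noise)]


# One channel, seven frames; the interval covers frames 2, 3 and 4.
channel_energy = [1.0, 1.0, 5.0, 6.0, 4.0, 2.0, 2.0]
cleaned = spectral_subtract(channel_energy, onset=2, offset=5)
```

Estimating noise separately for each unit, rather than using one average for the whole interval, lets the estimate follow noise that rises or falls across the interval.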
The experimental results show that the improved approach requires less computation and achieves better unvoiced speech segregation performance.

For separating speech from competing speech, also called two-talker speech segregation, this thesis proposes a two-talker speech separation system that combines CASA with speaker recognition. Voiced speech is first organized simultaneously using the Tandem algorithm. An objective function is then built by clustering Gammatone frequency cepstral coefficients (GFCC) to realize speaker recognition, and the best grouping is found by exhaustive search or beam search, so that voiced speech is organized sequentially. Unvoiced segments are generated by onset/offset estimation and divided into unvoiced-voiced (U-V) and unvoiced-unvoiced (U-U) segments, which are separated in different ways: U-V segments are handled using the binary mask of the separated voiced speech, while U-U segments are split evenly. This completes the separation of the unvoiced speech. Simulation and performance evaluation verify the feasibility and effectiveness of the proposed algorithm.
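The sequential grouping step, which assigns each voiced segment to one of two talkers, can be sketched as follows. The thesis's actual objective function is not given in the abstract, so this example assumes a stand-in: within-group squared scatter of per-segment feature vectors (short lists of floats standing in for GFCC features), where lower is better. The names `scatter`, `exhaustive_search`, and `beam_search`, and the `width` parameter, are hypothetical.

```python
from itertools import product


def scatter(features, labels):
    """Within-group squared scatter; lower means tighter talker groups."""
    total = 0.0
    for talker in (0, 1):
        group = [f for f, l in zip(features, labels) if l == talker]
        if not group:
            continue
        dim = len(group[0])
        mean = [sum(f[d] for f in group) / len(group) for d in range(dim)]
        total += sum((f[d] - mean[d]) ** 2 for f in group for d in range(dim))
    return total


def exhaustive_search(features):
    """Score every labeling; the first segment is fixed to talker 0."""
    candidates = ((0,) + rest
                  for rest in product((0, 1), repeat=len(features) - 1))
    return min(candidates, key=lambda labels: scatter(features, labels))


def beam_search(features, width=4):
    """Label segments one at a time, keeping the `width` best partial labelings."""
    beams = [(0,)]
    for _ in range(1, len(features)):
        expanded = [b + (t,) for b in beams for t in (0, 1)]
        expanded.sort(key=lambda labels: scatter(features[:len(labels)], labels))
        beams = expanded[:width]
    return beams[0]


# Toy features: segments 0, 2, 4 resemble one talker; 1 and 3 the other.
segments = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.1, 4.9], [0.1, 0.2]]
best_exhaustive = exhaustive_search(segments)
best_beam = beam_search(segments, width=2)
```

Exhaustive search scores a number of labelings that doubles with each added segment, while beam search scores at most `2 * width` candidates per segment. That saving comes at the risk of pruning the best grouping early.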
Keywords/Search Tags: Computational auditory scene analysis (CASA), speech separation, spectral subtraction, unvoiced speech segregation, GFCC