The Speech Enhancement System Based On Binary Mask And Perceptual Wavelet Packet Transform

Posted on: 2012-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Shen
Full Text: PDF
GTID: 2218330368492423
Subject: Detection Technology and Automation
Abstract/Summary:
Speech is almost always corrupted to some degree by the surrounding environment or the transmission medium. Corrupted speech signals not only cause auditory fatigue but also degrade the performance of speech processing systems such as speech coding and speech recognition. Speech enhancement techniques are therefore needed to suppress the effects of noise. Building on a study of spectral subtraction methods, this thesis proposes a new speech enhancement system based on a binary mask and the perceptual wavelet packet transform. The main contributions are as follows.

First, existing methods that exploit auditory masking properties consider only simultaneous masking. This thesis introduces a temporal masking factor that combines simultaneous masking with temporal masking, which brings the model closer to human auditory perception. To separate residual noise from speech distortion, we define a differential wavelet coefficient as the difference between the wavelet coefficients of the clean speech and those of the enhanced speech. We treat the differential wavelet coefficients as a linear superposition of speech distortion and residual noise, and define a cost function that combines the two. Under the constraint that the residual noise energy stays below the masking threshold, we minimize the speech distortion to optimize the gain function, which yields the optimal subtraction parameters for enhancing the noisy speech.

Second, existing algorithms tend to damage unvoiced speech. To address this, we propose a speech enhancement algorithm based on a binary mask. Following computational auditory scene analysis, we segment the noisy unvoiced speech into time-frequency units and label each unit as either target-dominated or masker-dominated. Target-dominated units are retained and masker-dominated units are removed.

Finally, the enhanced unvoiced speech is combined with the voiced speech enhanced by the perceptual wavelet packet transform method to synthesize the complete enhanced speech. Both subjective and objective evaluations are carried out on the enhanced speech. Simulation results show that, compared with other methods, the proposed system better removes background noise, suppresses residual noise, and minimizes speech distortion while preserving unvoiced speech. The thesis closes by discussing the shortcomings of the method and the problems that remain unsolved, and by suggesting directions for further study and improvement.
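The binary-mask step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the per-unit target and masker energies are already known (an oracle, "ideal" binary mask), whereas the thesis has to estimate the labels from noisy unvoiced speech. The function names `binary_mask` and `apply_mask`, the 0 dB decision threshold `lc_db`, and the toy energy values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def binary_mask(target_energy, masker_energy, lc_db=0.0):
    """Label each time-frequency unit: 1.0 = target-dominated, 0.0 = masker-dominated.

    Assumption: a unit is target-dominated when its local SNR exceeds lc_db.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    local_snr_db = 10.0 * np.log10((target_energy + eps) / (masker_energy + eps))
    return (local_snr_db > lc_db).astype(float)

def apply_mask(noisy_units, mask):
    """Retain target-dominated units and remove (zero out) masker-dominated ones."""
    return noisy_units * mask

# Toy energies on a grid of 2 frequency channels x 3 time frames (made-up values).
target = np.array([[4.0, 0.1, 2.0],
                   [0.5, 3.0, 0.2]])
masker = np.ones((2, 3))

mask = binary_mask(target, masker)
noisy = target + masker          # assumes target and masker energies add
enhanced = apply_mask(noisy, mask)
```

In this toy grid, units where the target energy exceeds the masker energy keep their noisy value and all other units are set to zero. A practical system would replace the oracle energies with estimates and would resynthesize a waveform from the retained units.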
Keywords/Search Tags: speech enhancement, binary mask, perceptual wavelet packet transform, unvoiced speech enhancement, masking property