Unlike traditional signal processing methods, a growing body of recent research achieves speech enhancement by simulating the human auditory system. Among these approaches, Computational Auditory Scene Analysis (CASA) studies how the human auditory system perceives sound signals and separates multiple sound sources, in both physiological and psychological terms, in order to build a computational speech processing model. This thesis focuses on the typical Hu-Wang model in monaural CASA and on a binaural auditory speech enhancement method, and carries out a detailed simulation analysis of the speech enhancement performance of the two models in multiple scenes. Theoretical analysis and simulation experiments show that, although the Hu-Wang model has a certain robustness, its choice of auditory cues and its seed-segment mechanism cause it to lose unvoiced components, so it can only be used in continuous voiced speech scenes. Most real speech scenes, however, contain non-continuous speech with pauses, and unvoiced sounds, which lack harmonic structure, also play an important role in semantic understanding. The binaural auditory speech enhancement method, which uses spatial cues, is suitable for such scenes and performs well, but it remains difficult to choose a feature threshold that balances denoising performance against retention of the target speech components.
Therefore, to improve the overall performance of the CASA speech enhancement system in non-continuous speech scenes containing unvoiced speech, this thesis starts from the perceptual processing of the human auditory system and implements an improved CASA speech enhancement system in which the monaural Hu-Wang model and the binaural auditory model complement each other. In the improved system, a target speech activity detection algorithm based on the binaural masking matrix, proposed in this thesis, is combined with the monaural Hu-Wang system to extend its applicability to non-continuous speech scenes. Then, in the auditory reorganization stage, a multi-feature masking method combines monaural and binaural auditory cues to preserve unvoiced components and improve speech enhancement performance. In the simulation experiments, this thesis first analyzes how the parameter thresholds of the improved CASA system affect performance and selects relatively optimal thresholds. Next, the performance of the target speech activity detection algorithm based on the binaural masking matrix is analyzed using several evaluation measures. Compared with the monaural model and the binaural auditory speech enhancement method, the improved CASA system proposed in this thesis effectively improves speech enhancement performance in non-continuous speech scenes that include both unvoiced and voiced sounds. Finally, the overall function of the improved CASA system is verified through the design and implementation of hardware and a system program.