The Research And Realization Of Monaural Speech Segregation System

Posted on: 2016-10-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Q Z He | Full Text: PDF
GTID: 2308330473454300 | Subject: Computer application technology
Abstract/Summary:
With the popularity of smartphones, human-computer interaction techniques have gained a new opportunity to develop, and making human-computer interaction easy and efficient has become a research hotspot in recent years. Speech separation is the foundation of many techniques, such as automatic speech recognition, automatic language translation, and speaker recognition, and it is also a key problem in human-computer interaction. Because many practical applications of human-computer interaction have only one voice input device, monaural speech separation systems have attracted the attention of many researchers.

On the basis of auditory scene analysis theory, researchers have proposed monaural speech segregation systems. Such a system processes sound in a way similar to how human auditory perception processes speech signals, and it has achieved good results after continual improvement.

This dissertation introduces the theories and algorithms of monaural speech segregation systems based on computational auditory scene analysis (CASA), and proposes an improved system built on the Hu-Wang CASA monaural speech segregation system. The main contributions are as follows:

1. The method for voiced speech segregation is improved using morphological image processing. The traditional CASA system uses a fixed threshold in the voiced speech segmentation stage, so the binary mask usually contains residual noise and broken speech segments. Because pitch detection and target unit labeling are both based on the binary mask, an inaccurate mask seriously degrades the separation results. This dissertation uses morphological image processing to eliminate noise and complement target speech in the high-frequency and low-frequency areas, respectively.
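The morphological cleanup of the binary mask described above can be illustrated with a minimal sketch. It is not the dissertation's implementation. It assumes a mask laid out as `mask[channel][frame]`, a one-dimensional structuring element along the time axis, and one possible reading of the abstract in which opening removes isolated noise units in high-frequency channels while closing fills short gaps in low-frequency channels. The names `clean_mask`, `split_channel`, and `width` are hypothetical.

```python
# Sketch only: pure-Python binary morphology on a time-frequency mask.
# Assumed layout: mask[channel][frame], 1 = unit labeled as target speech.

def erode_row(row, width=3):
    """Keep a unit only if every unit in its time window is 1."""
    half = width // 2
    return [1 if all(row[max(0, t - half):t + half + 1]) else 0
            for t in range(len(row))]

def dilate_row(row, width=3):
    """Set a unit to 1 if any unit in its time window is 1."""
    half = width // 2
    return [1 if any(row[max(0, t - half):t + half + 1]) else 0
            for t in range(len(row))]

def open_row(row, width=3):
    """Opening (erode, then dilate): removes runs shorter than the window."""
    return dilate_row(erode_row(row, width), width)

def close_row(row, width=3):
    """Closing (dilate, then erode): fills gaps shorter than the window."""
    return erode_row(dilate_row(row, width), width)

def clean_mask(mask, split_channel, width=3):
    """Close channels below split_channel (assumed low frequency) and
    open the remaining channels (assumed high frequency)."""
    return [close_row(row, width) if ch < split_channel else open_row(row, width)
            for ch, row in enumerate(mask)]

mask = [
    [1, 1, 0, 1, 1, 0, 0],  # low-frequency channel with a one-frame gap
    [0, 0, 1, 0, 0, 0, 0],  # high-frequency channel with an isolated unit
]
cleaned = clean_mask(mask, split_channel=1)
# cleaned == [[1, 1, 1, 1, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
```

In this toy mask the one-frame gap in the low-frequency channel is filled and the isolated high-frequency unit is discarded. The window width and the channel split are illustrative choices that a real system would need to tune.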
Experiments on multiple speech signals in different noise environments show that this method effectively increases the proportion of the speech signal that is retained and suppresses noise in the initial voiced segregation stage.

2. The iterative procedure of the Hu-Wang CASA monaural speech segregation system is improved by considering the objective quality of the intermediate target speech. Concretely, the result of each iteration is synthesized into an audio file, the file is evaluated with P.563, and the evaluation result determines whether to end the iterative procedure. Experiments show that this method delivers better results than the traditional CASA system. First, because the algorithm uses the MOS-LQP score from objective speech quality evaluation to control the iterative process, the system has good scalability. Second, P.563 is a more suitable indicator for controlling the iterative process than the convergence of the iterative result, since P.563 is an authoritative ITU-T standard.
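The quality-controlled iteration in contribution 2 can be sketched as a generic loop. This is an illustration under stated assumptions, not the dissertation's code. `refine_step` stands in for one CASA iteration and `score_quality` stands in for synthesizing the intermediate result and scoring it with P.563. Both are hypothetical callables supplied by the caller. The stopping rule used here (stop once the score no longer improves by more than `min_gain`) is also an assumption, since the abstract does not state the exact criterion.

```python
# Sketch only: iterate while an objective quality score keeps improving.

def iterate_with_quality_control(state, refine_step, score_quality,
                                 max_iters=20, min_gain=0.0):
    """Return (best_state, best_score, history of all scores seen)."""
    best_state = state
    best_score = score_quality(state)
    history = [best_score]
    for _ in range(max_iters):
        candidate = refine_step(best_state)   # one more segregation pass
        score = score_quality(candidate)      # e.g. a MOS-like score
        history.append(score)
        if score - best_score <= min_gain:    # no real improvement: stop
            break
        best_state, best_score = candidate, score
    return best_state, best_score, history

# Toy stand-ins: the "state" is an integer and quality peaks at 3.
best, score, history = iterate_with_quality_control(
    0,
    refine_step=lambda s: s + 1,
    score_quality=lambda s: -(s - 3) ** 2,
)
# best == 3, score == 0, history == [-9, -4, -1, 0, -1]
```

Keeping the scorer as a pluggable callable reflects the scalability point made above: a different objective quality measure could be substituted without changing the loop.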
Keywords/Search Tags:Computational Auditory Scene Analysis, Speech Separation, Energy objective estimation of speech quality, Pitch Tracking