The Research Of Speech Enhancement Algorithm

Posted on: 2007-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: J Sun
Full Text: PDF
GTID: 2178360182996498
Subject: Communication and Information System
Abstract/Summary:
Introduction
In speech communication systems, additive broadband noise severely degrades the quality and intelligibility of speech. Because broadband noise overlaps the speech signal across the whole spectrum, it is difficult to filter out. The main goal of speech enhancement is to extract the clean speech signal from the noisy speech signal. Because the noise is stochastic, however, the clean speech cannot be recovered exactly. Speech enhancement has two aims. One is to improve speech quality by suppressing the background noise, so that the enhanced speech is acceptable to listeners and less fatiguing; this is a subjective measure. The other is to improve intelligibility; this is an objective measure.

The problem of enhancing speech degraded by uncorrelated additive noise, when only the noisy speech is available, has received much attention. Speech enhancement is a classic research topic because of its broad application to speech processing tasks. Many techniques have proved useful for increasing speech intelligibility, reducing listener fatigue, and improving speech recognition and speaker identification systems. Among the many approaches to recovering clean speech, spectral subtraction is one of the most widely used in the single-channel case because of its computational efficiency.

Spectral subtraction assumes that the speech and noise signals are uncorrelated, so subtracting a noise spectral estimate from the noisy speech spectrum yields an estimate of the clean speech spectrum. The enhanced speech is then reconstructed via an IFFT from the modified magnitude spectrum and the original phase spectrum. This subtractive approach generally gives reasonably clear speech, apart from an annoying tonal artifact known as "musical noise".
Reducing the influence of musical noise, especially in non-stationary noise environments, is of particular importance.

1. Voice Activity Detection (VAD)
A crucial component of a practical speech enhancement system is the estimation of the noise power spectrum. A common approach is to average the noisy signal over non-speech sections. Speech pause detection can be implemented on a frame-by-frame basis. Voice activity detection (VAD), often called endpoint detection, is a very important stage: locating the starting point of the input speech precisely is essential to the good performance of the speech enhancement system. In a single-channel system, it is necessary to find the gaps in the speech.

Most endpoint detection methods extract some feature of the signal and compare it with a threshold to determine the starting point and the ending point. Early endpoint detectors could choose among many features, for example short-time energy, short-time zero-crossing rate, and LPC coefficients. The G.729B VAD standard analyzes the speech frames every 10 ms using the LSFs, the low-band frame energy, the short-time zero-crossing rate, and the full-band energy.

In this thesis, we study and simulate a two-threshold VAD based on short-time energy and short-time zero-crossing rate. The results show that this method can find the starting point and the ending point, but not very accurately. After extensive experiments and simulations, the author proposes a new two-threshold method based on the Teager energy operator. The experiments indicate that the new method detects the starting point and the ending point of a speech segment much more accurately than the classic one.

2. Speech Enhancement Using Spectral Subtraction
The basic concept of spectral subtraction is to separate the clean speech from the noise in the spectral domain.
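As a rough illustration of this idea, the sketch below applies magnitude spectral subtraction to a single frame: it subtracts an averaged noise magnitude estimate from the noisy magnitude spectrum, keeps the noisy phase, and inverts the transform. It uses plain Python with a naive DFT so that it has no dependencies. The function names, the subtraction coefficient `alpha`, and the spectral floor `floor` are illustrative assumptions, not the implementation or parameter values used in the thesis.

```python
import cmath
import math
import random


def dft(frame):
    """Naive O(N^2) discrete Fourier transform (fine for short demo frames)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns a list of complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def average_noise_magnitude(noise_frames):
    """Average the magnitude spectra of frames assumed to contain no speech."""
    spectra = [[abs(v) for v in dft(frame)] for frame in noise_frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def spectral_subtract(noisy_frame, noise_mag, alpha=1.0, floor=0.01):
    """Return the enhanced frame after magnitude spectral subtraction.

    alpha (assumed name) scales the noise estimate; floor (assumed name) keeps
    a small fraction of the noisy magnitude so that no bin goes negative.
    """
    cleaned = []
    for value, noise in zip(dft(noisy_frame), noise_mag):
        magnitude = abs(value)
        # Subtract in the magnitude domain and clamp at the spectral floor.
        new_magnitude = max(magnitude - alpha * noise, floor * magnitude)
        # Reuse the phase of the noisy spectrum.
        cleaned.append(cmath.rect(new_magnitude, cmath.phase(value)))
    # A real input with a symmetric noise estimate gives a real output,
    # so the tiny imaginary residue is discarded.
    return [sample.real for sample in idft(cleaned)]


if __name__ == "__main__":
    random.seed(0)
    n = 64
    clean = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    # Synthetic stand-ins: eight noise-only frames and one noisy frame.
    noise_frames = [[random.gauss(0.0, 0.3) for _ in range(n)] for _ in range(8)]
    noisy = [c + random.gauss(0.0, 0.3) for c in clean]
    enhanced = spectral_subtract(noisy, average_noise_magnitude(noise_frames))
    error_before = sum((a - b) ** 2 for a, b in zip(noisy, clean))
    error_after = sum((a - b) ** 2 for a, b in zip(enhanced, clean))
    print("squared error before:", error_before, "after:", error_after)
```

Here `alpha` is held fixed for the whole frame. A practical system would process overlapping windowed frames and take the noise-only frames from a VAD decision.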
Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. Classic spectral subtraction fixes the subtraction coefficients, but in practice the coefficients should change during the speech enhancement process. In this thesis the author proposes a method in which the subtraction coefficients vary during the speech enhancement process. Experiments and simulations show that it extracts clearer speech from the noisy signal.

3. Independent Component Analysis and Speech Enhancement
Blind signal processing (BSP) developed rapidly during the last ten years of the 20th century. "Blind" means that very little prior knowledge of the signals is available when separating or deconvolving the mixed signal. Blind signal separation (BSS) deals with three kinds of signals. The first is the super-Gaussian signal, such as an image signal. The second is the sub-Gaussian signal, such as a speech signal. The last is the Gaussian signal, such as noise.

Independent component analysis (ICA) is a kind of BSS. In this thesis, the author uses FastICA to separate the mixed signal. The speech signal and the noise are treated as two source signals, one sub-Gaussian and the other Gaussian. If these two signals can be separated, the noisy speech has been enhanced. Experiments and simulations show that FastICA enhances noisy speech very effectively.
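The separation step can be sketched as follows, assuming two observed mixtures of one non-Gaussian source and one Gaussian noise source. This is a minimal two-channel, one-unit FastICA with the kurtosis (cubic) nonlinearity, written in plain Python. The function names, the mixing weights, and the synthetic sine wave standing in for speech are assumptions made for illustration, not the experimental setup of the thesis.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def whiten(x1, x2):
    """Center two mixtures, then rotate and scale them to identity covariance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c11 = sum(u * u for u in a) / n
    c22 = sum(v * v for v in b) / n
    c12 = sum(u * v for u, v in zip(a, b)) / n
    # Closed-form rotation that diagonalises a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    p = [c * u + s * v for u, v in zip(a, b)]
    q = [-s * u + c * v for u, v in zip(a, b)]
    std_p = math.sqrt(sum(u * u for u in p) / n)
    std_q = math.sqrt(sum(v * v for v in q) / n)
    return [u / std_p for u in p], [v / std_q for v in q]


def fastica_2d(x1, x2, max_iter=200, tol=1e-10):
    """Separate two mixtures; outputs are defined only up to order, sign, scale."""
    z1, z2 = whiten(x1, x2)
    n = len(z1)
    w1, w2 = 1.0, 0.0  # initial unit-norm weight vector
    for _ in range(max_iter):
        y = [w1 * a + w2 * b for a, b in zip(z1, z2)]
        # Fixed-point update w <- E[z * y^3] - 3 w, followed by renormalisation.
        u1 = sum(a * v ** 3 for a, v in zip(z1, y)) / n - 3.0 * w1
        u2 = sum(b * v ** 3 for b, v in zip(z2, y)) / n - 3.0 * w2
        norm = math.hypot(u1, u2)
        u1, u2 = u1 / norm, u2 / norm
        # Converged when the direction stops changing (the sign may flip).
        done = abs(abs(u1 * w1 + u2 * w2) - 1.0) < tol
        w1, w2 = u1, u2
        if done:
            break
    first = [w1 * a + w2 * b for a, b in zip(z1, z2)]
    # In two dimensions the second component is the orthogonal direction.
    second = [-w2 * a + w1 * b for a, b in zip(z1, z2)]
    return first, second


if __name__ == "__main__":
    random.seed(0)
    n = 2000
    # Synthetic stand-ins: a sine wave for the speech source, white Gaussian noise.
    speech = [math.sqrt(2.0) * math.sin(2.0 * math.pi * 0.013 * t) for t in range(n)]
    noise = [random.gauss(0.0, 1.0) for _ in range(n)]
    mix1 = [s + 0.6 * v for s, v in zip(speech, noise)]
    mix2 = [0.5 * s + v for s, v in zip(speech, noise)]
    out1, out2 = fastica_2d(mix1, mix2)
    print("|corr| with source:", abs(correlation(out1, speech)),
          abs(correlation(out2, speech)))
```

Because ICA cannot fix the order or sign of its outputs, the component that corresponds to the speech source has to be identified afterwards, here by correlation with the known synthetic source.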
Keywords/Search Tags: spectral subtraction, speech enhancement, VAD, two-threshold method, Teager energy operator, blind signal separation, ICA