Speech Enhancement Research Based On Sparse Representation

Posted on: 2014-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhou
Full Text: PDF
GTID: 2268330425483754
Subject: Computer Science and Technology
Abstract/Summary:
In a voice communication system operating in a real environment, the voice signal is inevitably subject to noise interference. Background noise seriously degrades voice quality and impairs listeners' ability to recognize the spoken information. Speech enhancement is therefore needed to improve voice quality and recover clean voice from noisy speech.

Speech enhancement is an effective method for eliminating noise. Its primary goal is to extract as much pure voice as possible from the background noise and to enhance the useful voice signal. It can also improve the quality and intelligibility of the voice signal. Judging from domestic and foreign research results, some speech enhancement methods reduce the background noise and improve the quality of the voice signal, but they still leave considerable noise in non-voice segments and introduce artificial noise and voice distortion. As a result, the enhanced voice differs substantially from the original clean voice.

To solve these problems, this thesis combines speech enhancement with sparse representation theory and proposes a novel adaptive spectral subtraction method based on sparse representation over a DCT orthogonal basis. To reconstruct the voice, the algorithm must choose the right DCT vectors to represent the pure voice according to the amplitude and characteristics of the environmental noise; selecting too many vectors degrades voice quality. To choose the right vectors adaptively, this thesis introduces a sparse soft threshold. The idea of the algorithm is as follows.

First, in the training phase, this thesis uses voice activity detection (VAD) to estimate the noise variance of the non-voice segments and takes this variance as the initial sparse threshold. This threshold is then passed to DFOA to estimate the globally optimal sparse soft threshold. DFOA is an improved version of FOA that remedies FOA's tendency to converge to local extrema.

Second, in the voice reconstruction phase, the optimal sparse soft threshold controls how the reconstruction algorithm selects DCT coefficient vectors, so that appropriate vectors are selected to represent the clean voice and the noise is suppressed or even removed. To reduce the error between the reconstructed speech and the original clean speech, this thesis proposes a general A*OMP algorithm as the reconstruction algorithm. Compared with the A*OMP algorithm, the proposed algorithm substantially improves both the accuracy and the speed of voice reconstruction. Its reconstruction accuracy matches that of the BP algorithm.

Finally, because the choice of DCT coefficient vectors is not optimal, the reconstructed voice still contains a small amount of residual noise. To suppress this residual noise further, the noise spectrum is subtracted from the reconstructed voice.

Experiments with Gaussian white noise and colored noise show that, in low-SNR environments, the proposed algorithm effectively filters the noise in non-voice segments and suppresses the noise in voice segments, thereby improving the quality and intelligibility of the voice.
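The thresholding idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it replaces the general A*OMP reconstruction with plain soft thresholding of DCT coefficients (which is what L1-penalized sparse coding reduces to when the basis is orthonormal), and it replaces the DFOA search with a fixed, hypothetical multiplier `scale` applied to a noise level estimated from a frame assumed to contain no speech. All function names, the frame length, and the test signal are assumptions made for illustration.

```python
import math
import random


def dct_basis(n):
    """Return the n orthonormal DCT-II basis vectors, one per row."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows


def analyze(frame, basis):
    """Project a frame onto every basis vector (forward DCT)."""
    return [sum(b * x for b, x in zip(row, frame)) for row in basis]


def synthesize(coeffs, basis):
    """Rebuild a frame as a weighted sum of basis vectors (inverse DCT)."""
    n = len(basis[0])
    return [sum(c * row[i] for c, row in zip(coeffs, basis)) for i in range(n)]


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; |c| <= t becomes exactly 0."""
    return [c - t if c > t else c + t if c < -t else 0.0 for c in coeffs]


def noise_std(silent_frame):
    """Standard deviation of a frame assumed (e.g. by a VAD) to hold only noise."""
    mean = sum(silent_frame) / len(silent_frame)
    var = sum((x - mean) ** 2 for x in silent_frame) / len(silent_frame)
    return math.sqrt(var)


def denoise(frame, basis, threshold):
    """Keep only the DCT coefficients that survive the soft threshold."""
    return synthesize(soft_threshold(analyze(frame, basis), threshold), basis)


def mse(a, b):
    """Mean squared error between two equal-length frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    random.seed(0)
    n = 64                      # hypothetical frame length
    basis = dct_basis(n)
    # Synthetic "clean" frame that is sparse in the DCT basis (two active vectors).
    clean = [3.0 * basis[3][i] + 2.0 * basis[7][i] for i in range(n)]
    noisy = [x + random.gauss(0.0, 0.1) for x in clean]
    silent = [random.gauss(0.0, 0.1) for _ in range(n)]   # noise-only frame
    scale = 3.0                 # hypothetical stand-in for the DFOA-tuned threshold
    threshold = scale * noise_std(silent)
    restored = denoise(noisy, basis, threshold)
    print("MSE noisy:   ", mse(noisy, clean))
    print("MSE restored:", mse(restored, clean))
```

Soft thresholding biases the surviving coefficients toward zero by the threshold amount, which is one reason a tuned threshold and a dedicated reconstruction algorithm, as the abstract describes, would be expected to behave differently from this sketch.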
Keywords/Search Tags: Speech Enhancement, Sparse Representation, General A*OMP, DCT orthogonal basis, DFOA