Computational Auditory Scene Analysis Based Voice Pretreatment System

Posted on:2014-08-24

Degree:Master

Type:Thesis

Country:China

Candidate:M Liu

GTID:2298330422490582

Subject:Information and Communication Engineering

Abstract/Summary:

With the development of modern communication technology, the types of noise and interference have become increasingly complex. Speech signals are often corrupted by environmental noise, which sharply degrades voice quality. Conventional speech recognition systems, such as IBM ViaVoice, perform well on clean speech or in high signal-to-noise ratio (SNR) scenarios. In many cases, however, noise and interference in a mobile environment push the SNR below the detection threshold, which degrades the performance of these recognition systems. Improving the anti-interference capability of speech recognition systems in mobile environments is therefore an urgent problem.

Most existing speech recognition techniques are based on pattern recognition and do not address noise reduction of the speech signal. To cope with this issue, we devise a new speech recognition system based on computational auditory scene analysis (CASA). Unlike conventional methods, the proposed system adds a CASA speech preprocessing module before the speech recognition engine and can thus improve speech recognition accuracy in mobile environments. By exploiting cross-channel correlation and time-domain continuity, all the audio elements that correspond to the same source are merged into one fragment so as to separate the speech signal of interest. Furthermore, we use the Hidden Markov Model Toolkit (HTK) to build a Chinese speech database and employ an endpoint detection algorithm when extracting the Mel-frequency cepstral coefficient (MFCC) features. Finally, we perform grammar training by combining the re-estimation algorithm with the MFCC features, and then use the hidden Markov model (HMM) to build the CASA-based speech recognition system.

Simulations are carried out to verify the effectiveness of the proposed system. In the simulations, we use two types of noise: road noise and indoor café noise. We test the robustness of the proposed scheme under different SNR conditions. The simulation results show that our approach is more robust than the conventional methods in terms of recognition accuracy, especially at low SNRs.
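
The abstract does not say how the noisy test signals were generated. A common way to test robustness at a chosen SNR, assumed here purely for illustration, is to scale the noise so that the speech-to-noise power ratio matches the target and then add it to the clean speech. The sketch below shows that step; the function names, the synthetic tone standing in for speech, and the Gaussian samples standing in for road or café noise are all hypothetical and not taken from the thesis.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def measure_snr_db(speech, noise):
    """SNR in dB, defined as 10 * log10(P_speech / P_noise)."""
    return 10 * math.log10(power(speech) / power(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to reach the target SNR, then add it to `speech`.

    Returns (mixture, scaled_noise). Both inputs must have equal length.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = power(speech)
    p_noise = power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # Noise power needed for the target: p_speech / 10^(target / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled


# Stand-in signals (assumptions): one second at 16 kHz.
SAMPLE_RATE = 16000
speech = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(SAMPLE_RATE)]

for target in (-5, 0, 5, 10):
    mixture, scaled = mix_at_snr(speech, noise, target)
    print(target, round(measure_snr_db(speech, scaled), 6))
```

In an evaluation like the one described, each mixture would be passed through the preprocessing module and the recognizer, and accuracy would be recorded per SNR level and noise type.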

Keywords/Search Tags:

speech processing, speech recognition, speech separation, auditory scene analysis, HTK

Related items

1	Research On Multi-Speaker Speech Separation And Speech Recognition In Noisy Environment
2	Method And Implementation Of Monophonic Double Speech Separation Based On Auditory Scene Analysis
3	Chinese Speech Recognition Technology And Its Application In Speech Separation
4	The Research Of Key Techniques Of Speech Separation And Speech Recognition
5	Speech Signal Processing Based On Auditory Neural Mechanisms
6	Speech Separation Research Based On Human Auditory Characteristics
7	Study On Speech Enhancement And Separation
8	The Blind Separation Of Monaural Speech Based On Computational Auditory Scene Analysis
9	The Research And Realization Of Monaural Speech Segregation System
10	Research On Speech Separation And Recognition Based On Deep Learning