
Research On Isolated Speech Recognition In Noise Environment

Posted on: 2019-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhang
Full Text: PDF
GTID: 2428330563499111
Subject: Information and Communication Engineering
Abstract/Summary:
Humans routinely recognize speech in challenging acoustic environments with background music, engine sounds and other acoustic noise, but today's Automatic Speech Recognition (ASR) systems do not perform well in these environments. Many ideas from recent experimental and theoretical research can be applied to this weakness of ASR. This thesis uses biologically inspired ASR methods to study the robustness of ASR in noisy environments. First, the theory of the Spectrotemporal Receptive Field (STRF) is studied. Recognition based on STRF features is compared with recognition based on ETSI features at different SNRs; the experimental results show that a model with auditory-neuron STRFs is more stable in noise, although its overall performance is not significantly improved. Second, speech is represented with spiking model neurons. Each neuron in this method is a feature detector that responds selectively to temporal features within a short time window of speech. A method for training the neurons' response characteristics with a support vector machine (SVM) is proposed, and the STRF of each neuron is calculated. Comparison with earlier physiological results indicates that spike sequences across the neural population can improve robustness. Two methods are used to decode the spike-based speech representation. The first uses classical ASR techniques based on Hidden Markov Models. The second is an improved template-based recognition method that exploits the noise invariance of the neural representation; it measures speech similarity by the longest common subsequence among spike sequences. Orthogonal optimization experiments were performed in different SNR environments. The results show that the best-performing combination is the neuron-based speech representation with the improved template-based recognition method. Finally, syllable information is incorporated into the ASR system: a syllable detection method that marks syllable nucleus positions is used to decode the spike representation of continuous speech. This method combines SVM-based training with a peak selection algorithm designed to improve the noise margin. Continuous speech was decoded with this method and with traditional methods in different SNR environments. The experimental results show that this method effectively improves the recognition rate under noisy conditions; under noise-free conditions, however, its recognition rate is lower than that of the traditional methods.
Keywords/Search Tags: ASR, STRF, Spike Sequence, Template Recognition, Syllable Detection
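
The abstract describes template-based recognition that scores similarity by the longest common subsequence (LCS) among spike sequences. A minimal sketch of that idea follows; it is not the thesis's implementation. It assumes each spike sequence is a time-ordered list of neuron indices, and the normalization by the longer sequence, the function names and the toy templates are all hypothetical choices made for illustration.

```python
from typing import Dict, Sequence


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)          # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip one element
        prev = curr
    return prev[-1]


def lcs_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """LCS length divided by the longer sequence length (assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def classify(query: Sequence[int], templates: Dict[str, Sequence[int]]) -> str:
    """Return the label of the template most similar to the query spike sequence."""
    return max(templates, key=lambda word: lcs_similarity(query, templates[word]))


# Toy data: each list holds the indices of the neurons that fired, in time order.
templates = {"one": [3, 1, 4, 1, 5], "two": [2, 7, 1, 8]}
noisy_query = [3, 4, 9, 5]  # some spikes missing, one spurious spike (9)
print(classify(noisy_query, templates))
```

Because a subsequence match tolerates both dropped and inserted spikes, a query that has lost or gained a few spikes can still align with its template, which is one plausible reading of why such a measure suits noisy input.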