Research And Application On Simultaneous Recognition Of Both Speech And Speaker | Posted on:2016-08-14 | Degree:Master | Type:Thesis | Country:China | Candidate:J Fang | Full Text:PDF | GTID:2298330467961850 | Subject:Computer application technology | Abstract/Summary: |
As computer technology has come into wide use, speech recognition technology has attracted more and more attention. Speech is one of the most popular means of human-computer interaction, and speech recognition technology is critical to man-machine voice interaction. In certain environments, we need methods that can not only accurately identify both the speech and the speaker of an utterance, but can also be applied in embedded systems, such as in-car and intelligent home systems. In this paper, we mainly analyze speech recognition and speaker identification as applied in an intelligent home system. Our research mainly includes:
(1) A study of voice activity detection and feature extraction, which are used for preprocessing the voice signal. With the purpose of recognizing speech and identifying the speaker’s group at the same time, we explore several speaker adaptation methods and further study the mechanism of simultaneous speech and speaker recognition proposed by Herbig in 2011.
(2) A method based on Bagging and GMM that integrates ensemble learning with speech recognition and improves the speech recognition rate and stability. To reduce space consumption, we use SQ (Soft Quantization) to integrate the speech models, which makes the speech recognition system more suitable for embedded systems with limited resources. Compared with a voting mechanism, this method can improve the speech recognition rate and stability when only a small number of speech models is used. With the purpose of recognizing speech and identifying the speaker’s group at the same time, we use SQ to integrate the speech models and the speaker’s group models, so that we can compute the optimal decoder for each frame of voice in real time and vote for the model with the highest SQ score.
By comparing the votes received by the models, we complete speaker’s group identification; meanwhile, we use the optimal decoders to complete speech recognition. When we integrated 6 speech recognition models, the average speech recognition rate reached 88% and the average speaker’s group recognition rate reached 81.56%. The experimental results confirm the feasibility of recognizing speech and the speaker’s group simultaneously in certain environments.
(3) In the intelligent home environment, we use the method of simultaneous speech and speaker’s group recognition to implement a simultaneous speech and speaker recognition system. When we integrated 5 speech recognition models, the speech recognition rate reached 96.64% and the speaker’s group recognition rate reached 88.24%. The experimental results show that this method is suitable for simultaneous speech and speaker recognition in the intelligent home environment.
| Keywords/Search Tags: | speech recognition, speaker identification, ensemble learning, speaker’s group recognition, SQ (Soft Quantization), voting mechanism, embedded system, Bagging |
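The per-frame voting idea in item (2) can be illustrated with a small sketch. This is not the thesis implementation: the abstract does not define the SQ score, so the sketch assumes it behaves like a normalized per-frame likelihood across models, and it replaces each GMM with a single diagonal Gaussian for brevity. The names `log_gauss`, `soft_scores`, `vote_over_frames`, and the toy models `group_a` and `group_b` are hypothetical.

```python
import math
from collections import Counter


def log_gauss(frame, mean, var):
    """Log-density of a diagonal-covariance Gaussian at one feature frame."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def soft_scores(frame, models):
    """Normalized per-model scores for one frame.

    Assumption: this normalized likelihood stands in for the SQ score,
    which the abstract does not define.
    """
    logs = {name: log_gauss(frame, m["mean"], m["var"]) for name, m in models.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(value - top) for name, value in logs.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def vote_over_frames(frames, models):
    """Pick the best-scoring model per frame, then take the majority vote."""
    winners = []
    for frame in frames:
        scores = soft_scores(frame, models)
        winners.append(max(scores, key=scores.get))
    group = Counter(winners).most_common(1)[0][0]
    return group, winners


# Hypothetical toy models: one Gaussian per speaker group over 2-D features.
models = {
    "group_a": {"mean": [0.0, 0.0], "var": [1.0, 1.0]},
    "group_b": {"mean": [3.0, 3.0], "var": [1.0, 1.0]},
}
frames = [[0.1, -0.2], [2.9, 3.1], [0.3, 0.2], [-0.1, 0.4]]
group, winners = vote_over_frames(frames, models)
```

In this sketch, `winners` plays the role of the per-frame choice of decoder, and `group` is the speaker’s group that collects the most votes over the utterance. A real system would score full GMMs on acoustic features such as those produced by the feature extraction step in item (1).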