
Research On Confidence Measure Of Speech Recognition

Posted on: 2011-01-28 | Degree: Master | Type: Thesis
Country: China | Candidate: J Liu | Full Text: PDF
GTID: 2178360308461334 | Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Voice short message input on mobile phones can offer people great convenience. It has practical applications but has not yet been well developed, so voice short message recognition has become a hot topic in speech recognition. Because of the characteristics of short messages, recognizing them can be very difficult. The main problems of voice short message recognition are the construction of a mobile phone speech database, the development of the recognition system, and confidence measures for the recognition results.

This paper studies voice short message recognition. We constructed a speech database and a text corpus, and built a confidence measure evaluation system. In addition, we made a preliminary study of imbalanced data set classification. The research focuses on feature extraction and feature selection in confidence measure classification and imbalanced data set classification. The main research contents are described in detail as follows.

A good speech corpus is of great help in training acoustic and language models. This paper analyzed three kinds of corpus selection algorithms and, according to the characteristics of the SMS corpus, developed a suitable corpus selection algorithm. 6,000 phonetically rich SMS messages were chosen from 500,000 raw SMS messages. On the precondition that all rare triphones in the raw material are selected, we tried to balance the triphone distribution. The theoretical triphone coverage rate of the 6,000 SMS messages reached 93.9%, and the actual coverage rate reached 100%. Based on the chosen corpus, we built an SMS speech database of more than 300 hours, involving 200 people.

Traditional methods rely on static features of a word to judge whether the word was recognized correctly, which neglects the information carried by its context and the surrounding environment.
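As an illustration of the corpus selection described in this abstract (select every rare triphone first, then balance triphone coverage), the following is a minimal Python sketch of a greedy selector. The abstract does not specify the thesis's actual algorithm, so the names `triphones` and `select_sentences`, the `rare_threshold` parameter, and the inverse-frequency scoring are all assumptions made for illustration only.

```python
from collections import Counter


def triphones(phones):
    """Return the set of triphones (windows of 3 consecutive phones) in a sentence."""
    return {tuple(phones[i:i + 3]) for i in range(len(phones) - 2)}


def select_sentences(corpus, budget, rare_threshold=1):
    """Hypothetical greedy selector: rare triphones first, then broad coverage.

    corpus: list of sentences, each a sequence of phone labels.
    budget: maximum number of sentences to keep.
    Returns (indices of chosen sentences, fraction of distinct triphones covered).
    """
    sent_tris = [triphones(p) for p in corpus]
    # Number of sentences in which each triphone occurs.
    freq = Counter(t for tris in sent_tris for t in tris)
    # Assumption: a triphone is "rare" if it occurs in at most rare_threshold sentences.
    rare = {t for t, c in freq.items() if c <= rare_threshold}

    chosen, covered = [], set()
    remaining = set(range(len(corpus)))

    # Pass 1: keep every sentence that holds a rare triphone not yet covered.
    for i in sorted(remaining):
        if len(chosen) >= budget:
            break
        if (sent_tris[i] & rare) - covered:
            chosen.append(i)
            covered |= sent_tris[i]
    remaining -= set(chosen)

    def gain(i):
        # Uncovered triphones count more when they are infrequent in the raw corpus.
        return sum(1.0 / freq[t] for t in sent_tris[i] - covered)

    # Pass 2: repeatedly add the sentence with the largest weighted gain.
    while remaining and len(chosen) < budget:
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # nothing new left to cover
        chosen.append(best)
        covered |= sent_tris[best]
        remaining.discard(best)

    coverage = len(covered) / len(freq) if freq else 0.0
    return chosen, coverage
```

A call such as `select_sentences(corpus, budget=6000)` would return the indices of the kept sentences together with the fraction of distinct triphones they cover. Weighting by inverse frequency is only one simple way to push the selection toward a more balanced triphone distribution.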
In this paper, a speech recognizer (SR) with a 14.1% word error rate (WER) is used as the baseline system, and 10-dimensional static features reduce the classification error rate (CER) by 24.9%. Context features and dynamic features are then extracted in relation to the static features. The full 42-dimensional feature set yields a better CER, 7.4%, than the static features alone. However, not all of these features have a positive impact on classification. Too many features not only carry redundant information but also make the classification process time-consuming. To solve this problem, this paper proposes feature extraction, which extracts the most important information from the original features, and feature selection, which selects the effective features from the original feature set. The experimental results show that context features and dynamic features are effective for classification, and that the features can be considerably compressed through feature extraction and feature selection.

The experimental data for confidence measure classification came from the speech recognition process. Because the recognition rate is relatively high, the ratio of correct to wrong samples reaches 8:1. To deal with this problem, imbalanced data set (IDS) classification is brought into the study. The author carried out a preliminary study of IDS classification and used a downsampling method to address the imbalance. The classification rate of wrong-class samples improved considerably, while the classification rate of correct-class samples decreased only slightly.
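The downsampling step mentioned above can be sketched as random removal of majority-class (correct) samples until the classes are roughly balanced. This is a generic random undersampler, not necessarily the variant used in the thesis; the function name `downsample`, its parameters, and the label strings `"correct"` and `"wrong"` are assumptions for illustration.

```python
import random


def downsample(samples, labels, majority_label, ratio=1.0, seed=0):
    """Randomly drop majority-class samples (hypothetical helper).

    Keeps every minority sample and at most ratio * (number of minority
    samples) majority samples, chosen uniformly at random.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    major = [i for i, y in enumerate(labels) if y == majority_label]
    minor = [i for i, y in enumerate(labels) if y != majority_label]
    keep_n = min(len(major), int(round(ratio * len(minor))))
    # Sort the kept indices so the original sample order is preserved.
    kept = sorted(rng.sample(major, keep_n) + minor)
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data with the 8:1 correct-to-wrong ratio reported in the abstract.
toy_labels = (["correct"] * 8 + ["wrong"]) * 10
toy_samples = list(range(len(toy_labels)))
balanced_x, balanced_y = downsample(toy_samples, toy_labels, "correct")
```

In a setup like this, downsampling would normally be applied only to the training data, so that evaluation still reflects the original class ratio.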
Keywords/Search Tags: confidence measure, mobile platform, corpus selection, feature extraction, feature selection, imbalanced data set