Combination and generation of parallel feature streams for improved speech recognition

Posted on: 2006-06-28
Degree: Ph.D
Type: Thesis
University: Carnegie Mellon University
Candidate: Li, Xiang
Full Text: PDF
GTID: 2458390008468412
Subject: Engineering
Abstract/Summary:
Combining information from parallel features that provide complementary information about the speech signal generally improves speech recognition accuracy. There are two issues associated with parallel feature combination: the specific method of combining the parallel features, and the nature of the parallel features themselves. These two issues jointly determine the performance of an information combination system.

While a great deal of work has already been done in the area of information combination, the techniques developed in previous studies have drawbacks and could be improved. First, most prior work has focused on the combination method itself, with relatively little attention paid to designing parallel features for optimal combination performance. Because the benefit of combination ultimately derives from the different information provided by the parallel features, designing the parallel features specifically to suit the combination procedure in use should improve the combined recognition accuracy. Second, conventional information combination methods can be refined to make better use of the information contained in the parallel features.

This thesis addresses both issues. We describe two new information combination schemes: lattice combination, which operates on the outputs of the individual recognition systems, and weighted probability combination, which operates at an earlier stage, where recognition probabilities are evaluated. We also describe a new way to combine directly the probabilities of untied HMM states. Both the lattice combination method and the two new probability combination methods are shown to provide consistent improvements over conventional combination methods on various speech recognition tasks.

The second part of this thesis concerns a new way of generating parallel linearly-derived features that directly optimizes their performance when combined. These features are designed to maximize the normalized acoustic likelihood of the observed speech in the combined system, which can be directly related to speech recognition accuracy. Our new parallel feature generation algorithm consistently outperforms conventional parallel feature sets on various speech recognition tasks. In addition, we achieve further improvement when we apply multiple techniques developed in this thesis simultaneously.
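The abstract does not give the formula used for weighted probability combination. As a rough illustration only, one common form of probability-level combination is a weighted sum of per-stream log-likelihoods for each HMM state. The sketch below assumes that form; the function name, the stream weights, and the likelihood values are all hypothetical and are not taken from the thesis.

```python
import math

def combine_log_likelihoods(stream_log_likes, weights):
    """Weighted log-linear combination of per-stream state log-likelihoods.

    stream_log_likes: one list per feature stream, each holding the
        log-likelihood of the current frame under every HMM state.
    weights: one non-negative weight per stream (assumed form, not the
        thesis's actual weighting scheme).
    """
    if len(stream_log_likes) != len(weights):
        raise ValueError("need exactly one weight per stream")
    n_states = len(stream_log_likes[0])
    if any(len(ll) != n_states for ll in stream_log_likes):
        raise ValueError("all streams must score the same set of states")
    # For each state, sum the weighted log-likelihoods across streams.
    return [
        sum(w * ll[s] for w, ll in zip(weights, stream_log_likes))
        for s in range(n_states)
    ]

# Hypothetical per-state likelihoods for one frame from two parallel streams.
stream_a = [math.log(0.6), math.log(0.3), math.log(0.1)]
stream_b = [math.log(0.2), math.log(0.7), math.log(0.1)]

combined = combine_log_likelihoods([stream_a, stream_b], [0.5, 0.5])
# Index of the state with the highest combined score.
best_state = max(range(len(combined)), key=combined.__getitem__)
```

With these made-up numbers, stream A alone prefers state 0 and stream B alone prefers state 1; with equal weights, the combined score favors state 1. In a real recognizer the combined scores would feed the decoder's search rather than a per-frame argmax.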
Keywords/Search Tags: Speech recognition, Combination, Parallel, Information