Research On BN Feature Based Acoustic Modeling And Its Application In Keyword Retrieval

Posted on:2016-02-22

Degree:Master

Type:Thesis

Country:China

Candidate:D Y Liu

Full Text:PDF

GTID:2308330467494914

Subject:Information and Communication Engineering

Abstract/Summary:

Speech retrieval based on large-vocabulary continuous speech recognition is an important research direction in multimedia retrieval. This thesis focuses on the speech recognition and keyword search components of speech retrieval, and covers three aspects: first, building a bottleneck (BN) feature based acoustic model for speech recognition; second, proposing an optimization method for the BN feature extractor to further improve recognition accuracy; third, establishing a keyword retrieval system and proposing optimization methods to improve retrieval performance.

BN feature based acoustic modeling combines the methods and advantages of Gaussian mixture model (GMM) based and deep neural network based recognition systems. BN features are extracted by a neural network with a narrow hidden layer and then used to train a GMM-based recognition system, whose performance is further enhanced by discriminative training techniques. To address the problems of low-resource languages, this thesis proposes several methods to improve the acoustic model, such as tone features and tone modeling, noise processing, and cross-lingual methods.

To further improve the performance of the BN feature based recognition system, this thesis proposes a method to optimize the BN feature extractor. Two discriminative objective functions, the maximum mutual information criterion (sMMI) and the minimum phone error criterion (MPE), are adopted to optimize the network and model parameters. Two training methods are used to update the network: updating only the last layer, and updating all parameters of the network. Because the optimization methods use all the training data to compute the gradient, they are suitable for parallel computing.

Because speech recognition produces errors, keywords need to be searched in the candidate word lattice. First, this thesis adopts retrieval methods based on confusion networks and weighted finite-state transducers (WFST) to obtain each keyword's position and score in the word lattice. Second, thresholding, term-dependent, and sum-to-one methods are used for confidence decisions. Finally, this thesis proposes optimized retrieval methods, namely building keyword-related language models, system combination, and combining confusion networks with WFST, which improve retrieval performance.

Targeting the NIST STD2006, OpenKWS2013, and OpenKWS2014 retrieval competitions, this thesis builds speech recognition and keyword retrieval systems for English, Vietnamese, and Tamil, respectively. The experimental results verify the validity of the proposed recognition and retrieval methods.
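
The abstract names a "sum to one" step for confidence judgment but does not give its formula. The following is a minimal sketch of the form commonly used in keyword search, not necessarily the thesis's exact implementation: each candidate hit's score is divided by the sum of the scores of all hits for the same keyword, so that one global threshold can be applied across frequent and rare keywords. The function names, the `gamma` exponent, the threshold value, and the sample hits are all assumptions made for illustration.

```python
from collections import defaultdict

def sum_to_one(hits, gamma=1.0):
    """Normalize hit scores so the scores for each keyword sum to 1.

    hits:  list of (keyword, start_time_sec, score) tuples, score >= 0.
    gamma: hypothetical sharpening exponent applied before normalizing.
    """
    totals = defaultdict(float)
    for keyword, _, score in hits:
        totals[keyword] += score ** gamma

    normalized = []
    for keyword, start, score in hits:
        total = totals[keyword]
        # Guard against a keyword whose hits all have zero score.
        new_score = (score ** gamma) / total if total > 0 else 0.0
        normalized.append((keyword, start, new_score))
    return normalized

def decide(hits, threshold=0.5):
    """Attach a YES/NO decision to each hit using one global threshold."""
    return [(kw, start, score, score >= threshold) for kw, start, score in hits]

# Illustrative (made-up) lattice hits: two for one keyword, one for another.
example_hits = [("hanoi", 1.2, 0.6), ("hanoi", 7.5, 0.2), ("rare", 3.0, 0.1)]
example_decisions = decide(sum_to_one(example_hits))
```

In this sketch the single low-scoring hit for "rare" is boosted to a normalized score of 1 because it has no competing hits, which is the intended effect of the normalization for infrequent keywords.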

Keywords/Search Tags:

Speech Retrieval, Speech Recognition, Deep Neural Network

Related items

1	Speech Recognition Front-End Processing Based On Deep Neural Network
2	Research On Robust Speech Recognition In Noise Environment
3	Design And Implementation Of Robust Speech Recognition System Based On Deep Neural Network
4	Research On Encrypted Speech Retrieval Method And Index Scheme Based On Deep Hashing
5	Research On Speech Emotion Recognition Model Based On Deep Neural Network
6	Research On Speech Phoneme Recognition Based On Deep Learning
7	Research On Speech Automatic Retrieval Technology For Broadcast News
8	Design And Implementation Of Noise Robust Speech Recognition Algorithm Based On Deep Learning
9	Research On Speech Separation And Recognition Based On Deep Learning
10	Research Of Deep Learning Neural Networks Applications In Speech Recognition