
Research on Long Distance Language Model Based on Word Activation Force

Posted on: 2015-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: M Qin
Full Text: PDF
GTID: 2298330467962376
Subject: Signal and Information Processing
Abstract/Summary:
The language model is a key component of speech recognition and an active research topic. Language modeling faces two long-standing problems: data sparseness and the loss of long-distance dependency information. This thesis uses language modeling and smoothing based on Word Activation Force to mine the syntactic and semantic information in corpus data. Experiments show that Word Activation Force can efficiently mine latent semantic information from large data sets, and that the extracted semantic information is consistent with the expression rules of natural language. The main research work and contributions are as follows:
1. Long-distance language modeling and smoothing based on Word Activation Force. A language model is first built with Word Activation Force, and this long-distance model is then interpolated with a conventional n-gram language model. Experiments show that the resulting long-distance dependency language model captures global as well as local information, which makes it well suited to text modeling. Word Activation Force is also used to extract long-distance information from text for language model smoothing. Experiments show that the smoothing method based on Word Activation Force works well in speech recognition.
2. Language model smoothing based on latent semantic analysis. Latent semantic analysis is applied to smoothing: singular value decomposition maps the high-dimensional term-document space to a low-dimensional latent semantic space, from which highly similar words are extracted for smoothing. Experimental results show that the method helps address data sparseness.
3. An improved cluster language model based on Word Activation Force affinity. Words are clustered by Word Activation Force affinity, and the top-k most similar words are then extracted from the cluster matrix to improve traditional cluster language modeling, which gives a clear advantage in addressing data sparseness. Experimental results for the semantic smoothing language model based on Word Activation Force show that the smoothing method retains the strong distinction between adjacent words while efficiently predicting unseen word sequences.
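The interpolation step in item 1 can be illustrated with a minimal sketch. The abstract does not give the interpolation formula or weights, so this assumes plain linear interpolation, P(w) = lam * P_ngram(w) + (1 - lam) * P_long(w). The function name `interpolate`, the weight `lam`, and the toy probability tables are all hypothetical and are not taken from the thesis.

```python
def interpolate(p_ngram, p_long, lam):
    """Linearly interpolate two next-word distributions.

    p_ngram: dict word -> probability from a conventional n-gram model
    p_long:  dict word -> probability from a long-distance model
    lam:     weight on the n-gram model, in [0, 1] (assumed, not from the thesis)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    vocab = set(p_ngram) | set(p_long)
    # A word missing from one model contributes probability 0 from that model.
    return {
        w: lam * p_ngram.get(w, 0.0) + (1.0 - lam) * p_long.get(w, 0.0)
        for w in vocab
    }


# Toy distributions with made-up numbers, for illustration only.
p_ngram = {"bank": 0.5, "river": 0.25, "money": 0.25}
p_long = {"bank": 0.25, "river": 0.0, "money": 0.75}

mixed = interpolate(p_ngram, p_long, lam=0.5)
```

Because both inputs sum to 1 and the weights sum to 1, the mixed distribution also sums to 1. In practice the weight would be tuned on held-out data rather than fixed by hand.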
Keywords/Search Tags: speech recognition, word activation force, long distance language model, latent semantic analysis, affinity clustering