Research On Application Of Language Model Adaptation For Embedded Systems

Posted on: 2007-04-20 Degree: Master Type: Thesis
Country: China Candidate: Q Liang Full Text: PDF
GTID: 2178360212485368 Subject: Computer Science and Technology
Abstract/Summary:
As embedded systems become widespread, the need for an efficient way to input Chinese characters on them keeps growing. Applying Chinese full-sentence input methods based on statistical language models, which are already mature on PCs, is therefore a valuable research task. Building on earlier research, this thesis makes the following contributions.

First, to handle spoken-language environments, language-style-based adaptation of language models is studied. Based on the differences between spoken and written language, an adaptation method that classifies each trigram by its style feature is proposed. In this method, weights are calculated dynamically according to the trigram's language-style tendency, and several weight-generation functions are proposed. The method adapts a written-language model to a spoken-language model, which improves the accuracy of pinyin-to-character conversion in spoken-language environments.

Second, to meet the personal input needs of embedded-system users, online adaptation of compressed language models is studied. Based on real-time feedback from the system, an online adaptation method that learns from error correction is proposed. The method uses user-supervised, example-based learning to correct errors; to respect the time-complexity constraints of embedded systems, the rank or count index of an n-gram in the embedded language model is modified gradually. The method adapts a general compressed language model to a user-dependent one, so the accuracy of pinyin-to-character conversion for a particular user of an embedded system improves gradually in practice.

In the Chinese full-sentence input system considered in this thesis, language-style-based adaptation first adapts the original PC language model into a general language model that is more effective in spoken-language environments. A compressed language model suitable for embedded systems is then obtained by compressing this general model with an existing method from other researchers. When a particular user works with the input system, online adaptation of the compressed language model continuously improves Chinese input efficiency for that user. Together, these methods help embedded-system users input Chinese more conveniently and quickly, with good performance.
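
The abstract does not give the weight-generation functions or the exact update rule, so the following Python sketch is only an illustration of the two ideas under stated assumptions. It assumes the style tendency of a trigram can be measured as a normalized difference between its probability under a spoken-language model and a written-language model, uses a sigmoid as one hypothetical weight-generation function, and assumes the compressed model stores each n-gram's count as a small integer index that a user correction nudges by one step. All function names, the `steepness` parameter, and the `step` size are hypothetical and not taken from the thesis.

```python
import math


def style_tendency(p_spoken, p_written):
    """Hypothetical style tendency in [-1, 1]: positive means 'more spoken'."""
    total = p_spoken + p_written
    if total == 0.0:
        return 0.0  # unseen in both corpora: no tendency either way
    return (p_spoken - p_written) / total


def sigmoid_weight(tendency, steepness=4.0):
    """One assumed weight-generation function, mapping tendency to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * tendency))


def adapted_probability(p_spoken, p_written, weight_fn=sigmoid_weight):
    """Interpolate the two models with a weight computed per trigram.

    Note: per-trigram weights do not keep the distribution normalized;
    a real system would need to renormalize or rescore accordingly.
    """
    lam = weight_fn(style_tendency(p_spoken, p_written))
    return lam * p_spoken + (1.0 - lam) * p_written


def adapt_on_correction(count_index, wrong_ngram, right_ngram,
                        max_index, step=1):
    """Assumed online update after a user corrects a conversion error.

    count_index maps an n-gram (tuple of tokens) to its quantized count
    index. The corrected n-gram is promoted and the erroneous one demoted,
    each by a small step, clamped to the range [0, max_index].
    """
    if right_ngram in count_index:
        count_index[right_ngram] = min(max_index,
                                       count_index[right_ngram] + step)
    if wrong_ngram in count_index:
        count_index[wrong_ngram] = max(0, count_index[wrong_ngram] - step)
    return count_index


if __name__ == "__main__":
    # A trigram that is more frequent in spoken text leans toward p_spoken.
    print(adapted_probability(p_spoken=0.3, p_written=0.1))

    # Toy compressed model: two candidate bigrams with quantized counts.
    model = {("wo", "men"): 3, ("wo", "meng"): 5}
    adapt_on_correction(model, wrong_ngram=("wo", "meng"),
                        right_ngram=("wo", "men"), max_index=7)
    print(model)
```

Updating only a small integer index per correction keeps each adaptation step constant-time and avoids re-estimating the model, which is in the spirit of the gradual modification described above; the actual rule used in the thesis may differ.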
Keywords/Search Tags: language model, embedded system, language-style-based adaptation, online adaptation, learning from error correction