
Overall / High-frequency Principle Of Priority,

Posted on: 1997-12-13  Degree: Master  Type: Thesis
Country: China  Candidate: J Deng  Full Text: PDF
GTID: 2208360185495470  Subject: Computer science theory
Abstract/Summary:
Pinyin stream conversion is a new approach to Chinese character input. In Chinese, one pinyin syllable usually corresponds to several characters, and contextual constraints are the only way to eliminate the unwanted interpretations.

The analysis of the conversion uses a scoring technique based on the holistic-probabilistic priority principle. In our view, neither the standard of word segmentation nor syntactic analysis yields a simple yes-or-no answer; each result is valid only with some "intensity". A scoring system is therefore used to evaluate the intensity of the rules and of the conversion results. The holistic-probabilistic priority principle underlying this scoring system is as follows: in word segmentation, longer and higher-frequency words receive higher scores; in syntactic analysis, a result that is derived by a "stronger" rule and that has a higher segmentation score receives a higher overall score.

Our conversion system works in two steps: word segmentation, followed by syntactic analysis. The word segmentation method is based on the concept of the "semi-word" and achieves high accuracy. Earley's parsing algorithm is extended to a more general case with the propagation of scores, which improves the elimination of unwanted interpretations.

This paper has five chapters. Chapter 1 elaborates the idea of the holistic-probabilistic priority principle. Chapter 2 introduces the current state of pinyin stream conversion and the author's work. Chapter 3 describes the algorithms for word segmentation and syntactic analysis. Chapter 4 briefly describes the structure of our conversion system and offers some suggestions for improving pinyin stream conversion. Chapter 5 summarizes the paper.
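The segmentation half of the priority principle (longer and higher-frequency words score higher) can be illustrated with a small sketch. This is not the thesis's "semi-word" method, which the abstract does not specify. The toy lexicon, its frequency counts, the scoring formula in `word_score`, and the function names are all assumptions chosen for illustration.

```python
import math

# Hypothetical toy lexicon: tuple of pinyin syllables -> [(word, frequency)].
# The entries and counts are invented for illustration only.
LEXICON = {
    ("zhong",): [("中", 50), ("种", 30)],
    ("guo",): [("国", 40), ("过", 60)],
    ("zhong", "guo"): [("中国", 80)],
    ("ren",): [("人", 90)],
    ("min",): [("民", 20)],
    ("ren", "min"): [("人民", 70)],
}


def word_score(length, freq):
    """Assumed score: grows with word length (squared) and with log frequency."""
    return (length ** 2) * math.log(1 + freq)


def best_segmentation(syllables):
    """Return (score, words) for the highest-scoring conversion, or None.

    Dynamic programming over prefixes: best[i] holds the best result
    for the first i syllables.
    """
    n = len(syllables)
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue  # the prefix itself cannot be converted
            key = tuple(syllables[start:end])
            for word, freq in LEXICON.get(key, []):
                score = best[start][0] + word_score(end - start, freq)
                if best[end] is None or score > best[end][0]:
                    best[end] = (score, best[start][1] + [word])
    return best[n]


result = best_segmentation(["zhong", "guo", "ren", "min"])
print(result[1])
```

With this assumed scoring, the two-syllable words 中国 and 人民 outscore any combination of single characters covering the same syllables, so the stream is segmented into two words. A syntactic stage, as described in the abstract, would then combine such segmentation scores with rule strengths; that stage is not sketched here.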
Keywords/Search Tags: pinyin stream conversion, holistic-probabilistic priority, scoring system