Alignment For Ancient-Modern Chinese Bi-text

Posted on: 2008-11-14  Degree: Master  Type: Thesis
Country: China  Candidate: Z Lin  Full Text: PDF
GTID: 2178360215483612  Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Bi-text alignment is useful for machine translation and for many other Natural Language Processing applications, such as bilingual lexicography and word sense disambiguation. Research on bi-text alignment has been carried out for several language pairs, such as English-French and English-Chinese, but there has been no research on Ancient-Modern Chinese bi-text. This paper describes a method for aligning Ancient-Modern Chinese bi-text at the sentence level and the clause level.

For sentence alignment, we first describe some characteristics of Ancient-Modern Chinese bi-texts. We then combine length information, alignment mode information, and Hanzi (Chinese character) information, and use dynamic programming to find the alignment with the least overall cost. In our experiment, sentence pairs are aligned with an accuracy of 92%. We also analyze some cases that are prone to mismatch.

Since clause alignment is more informative for further applications, this paper also implements a clause alignment task, using a method similar to the one used for sentence alignment. In our experiment, 93% of clause pairs are aligned correctly. Furthermore, we discuss the influence of the different statistical information sources.
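The abstract does not give the cost function, so the following is only a minimal sketch of how length, alignment mode, and Hanzi information could be combined in a dynamic-programming aligner. It is not the thesis's actual method. The mode penalties in `MODE_COST`, the expected modern-to-ancient length ratio `ratio=1.6`, the equal weighting of the three terms, and the function names `pair_cost` and `align` are all assumptions made for illustration.

```python
import math

# Hypothetical penalties per alignment mode:
# (number of ancient sentences, number of modern sentences) -> cost.
MODE_COST = {(1, 1): 0.0, (1, 2): 1.0, (2, 1): 1.0, (1, 0): 3.0, (0, 1): 3.0}


def pair_cost(anc, mod, mode, ratio=1.6):
    """Cost of aligning ancient text `anc` with modern text `mod`.

    `ratio` is an assumed expected modern/ancient length ratio.
    """
    la, lm = len(anc), len(mod)
    # Length term: deviation of the modern length from its expected value.
    length_cost = abs(lm - ratio * la) / math.sqrt(la + lm + 1)
    # Hanzi term: fraction of distinct ancient characters missing from the modern side.
    shared = len(set(anc) & set(mod))
    hanzi_cost = 1.0 - shared / max(1, len(set(anc)))
    return length_cost + MODE_COST[mode] + hanzi_cost


def align(ancient, modern):
    """Return the least-cost list of (ancient_group, modern_group) pairs."""
    n, m = len(ancient), len(modern)
    inf = float("inf")
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            # Try every alignment mode starting at position (i, j).
            for da, dm in MODE_COST:
                ni, nj = i + da, j + dm
                if ni > n or nj > m:
                    continue
                cost = best[i][j] + pair_cost(
                    "".join(ancient[i:ni]), "".join(modern[j:nj]), (da, dm)
                )
                if cost < best[ni][nj]:
                    best[ni][nj] = cost
                    back[ni][nj] = (i, j)
    # Trace back from the end of both texts to recover the alignment.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((ancient[pi:i], modern[pj:j]))
        i, j = pi, pj
    pairs.reverse()
    return pairs
```

Calling `align(ancient_sentences, modern_sentences)` with two lists of strings returns grouped pairs that cover both texts in order. The same routine could be applied to clauses by passing clause lists instead of sentence lists, which mirrors the abstract's remark that clause alignment uses a similar method.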
Keywords/Search Tags: sentence alignment, clause alignment, statistical information, Hanzi information, dynamic programming