The Research And Implementation Of Computability On The Chinese Segmentation Specification

Posted on: 2007-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: S Xu
Full Text: PDF
GTID: 2178360185978458
Subject: Computer application technology
Abstract/Summary:
Automatic Chinese word segmentation is a fundamental and difficult problem in Chinese information processing (CIP), and one of the bottlenecks in the field's development. The segmentation standard is the central issue in automatic segmentation. Although many researchers have studied the resolution of segmentation ambiguity and the handling of out-of-vocabulary words, current research has somewhat underestimated the importance of the segmentation specification, and this has hindered the development of CIP.

This thesis examines the scientific soundness of the National Chinese Language Word Segmentation Specification for Information Processing. It analyzes the specification's completeness and consistency and points out the flaws found. Based on a detailed analysis of the specification's rules, it then implements them as computable rules. We identify inconsistently segmented results in a large-scale Chinese corpus, mine empirical rules from them using part-of-speech information, and check these rules manually. Finally, we segment the corpus using the computable rules and analyze the results, making a comprehensive comparison between the system with the rules and the system without them. The thesis concludes by summarizing the importance of applying the segmentation specification.
Keywords/Search Tags: automatic segmentation, Chinese segmentation specification, scientific soundness, computability