The Design And Implementation Of An Approach To Chinese Disambiguation Based On Classification Rule Discovery

Posted on: 2007-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: J L Xia
Full Text: PDF
GTID: 2178360185978145
Subject: Computer application technology
Abstract/Summary:
Automatic Chinese word segmentation is a fundamental task in Chinese information processing. It comprises three difficult problems: defining the word criterion, disambiguation, and unknown word identification. Many researchers have contributed to this field, but higher precision is still needed. This thesis addresses the problem of disambiguation. From a review of the disambiguation literature, we conclude that segmentation precision depends on the evaluation function, yet existing evaluation functions are simplistic and lack objectivity and completeness. The thesis therefore collects, from a training corpus, ambiguous data that represents the state of each ambiguous phrase and its linguistic context, and mines rules from this data using classification rule discovery. Because the rules are learned from the training corpus and reflect the state of each segmentation point, they can serve as an evaluation function that is both objective and complete. Finally, we apply the discovered rules as the evaluation function on the SIGHAN test text. The experimental results are satisfactory.
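The abstract does not say which rule discovery algorithm or which features the thesis uses, so the following is only a minimal sketch of the general idea, not the author's method. It assumes a very simple learner that counts how often a single context feature co-occurs with a segmentation decision and keeps the rules that pass support and confidence thresholds. The feature names (`prev`, `next`), the labels (`cut`, `join`), the thresholds, and the toy training examples are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training examples: (features, label).
# Features describe the context of an ambiguous segmentation point;
# the label says whether a word boundary belongs there ("cut") or not ("join").
TRAIN = [
    ({"prev": "的", "next": "了"}, "cut"),
    ({"prev": "的", "next": "是"}, "cut"),
    ({"prev": "不", "next": "了"}, "join"),
    ({"prev": "不", "next": "是"}, "join"),
    ({"prev": "的", "next": "在"}, "cut"),
]


def discover_rules(examples, min_support=2, min_confidence=0.8):
    """Return single-condition rules {(feature, value): (label, confidence)}."""
    counts = defaultdict(Counter)
    for features, label in examples:
        for item in features.items():
            counts[item][label] += 1
    rules = {}
    for item, label_counts in counts.items():
        # Majority label for this feature value and how often it occurs.
        label, hits = label_counts.most_common(1)[0]
        total = sum(label_counts.values())
        if hits >= min_support and hits / total >= min_confidence:
            rules[item] = (label, hits / total)
    return rules


def evaluate(rules, features, default="cut"):
    """Use the rules as an evaluation function: the most confident match wins."""
    matches = [rules[item] for item in features.items() if item in rules]
    if not matches:
        return default  # no rule applies, fall back to an assumed default
    return max(matches, key=lambda rule: rule[1])[0]


rules = discover_rules(TRAIN)
decision = evaluate(rules, {"prev": "不", "next": "在"})
```

In this sketch the rule set replaces a hand-written scoring heuristic: each candidate segmentation point is described by its context features, and the learned rules decide whether to place a boundary there. A real system would presumably use richer features and multi-condition rules.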
Keywords/Search Tags: Chinese automatic word segmentation, Disambiguation, Classification rule discovery