Error-Driven Chinese Part-of-Speech Annotation Research

Posted on: 2008-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2178360215983607
Subject: Control theory and control engineering
Abstract/Summary:
In recent years, with the rapid development and expansion of Chinese corpora and annotation technologies, large-scale corpora of the national language and various types of tagging feature sets have appeared. Research on deep-processing methods and the relevant algorithms is needed to advance Natural Language Processing. As with other languages, the first step in extracting knowledge from a Chinese corpus is part-of-speech tagging. Annotation systems that run on computers support computational linguistics, which has attracted wide attention from related fields such as Artificial Intelligence.

There are several annotation approaches, most of which are based on statistical algorithms and manually written rules. Models such as the Maximum Entropy model and the Hidden Markov Model, integrated with different rule templates, can provide tagging tools for natural language. However, the tagging results are not good enough to be applied to deep-level annotation of real text.

Based on statistical examples collected from the system's multi-word annotation errors, this thesis introduces three supplementary models for the part-of-speech task, built on the Maximum Entropy model. A new error-based method is proposed for choosing feature templates for multi-word units; it is composed of events with feature probabilities calculated in advance.
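
The abstract does not spell out how error events and feature probabilities are computed, so the following is only a minimal Python sketch of one possible reading: compare a tagger's output with gold tags, and estimate for each feature how often it occurs at a mis-tagged position. The template set, the function name `error_feature_probabilities`, and the pinyin toy data are all hypothetical and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical feature templates (not from the thesis): each maps a token
# position to a feature value drawn from the word itself or a neighbour.
TEMPLATES = {
    "w0": lambda words, i: words[i],
    "w-1": lambda words, i: words[i - 1] if i > 0 else "<s>",
    "w+1": lambda words, i: words[i + 1] if i + 1 < len(words) else "</s>",
}


def error_feature_probabilities(sentences):
    """Estimate P(error | feature) by comparing predicted tags with gold tags.

    sentences: iterable of (words, gold_tags, predicted_tags) triples.
    Returns a dict mapping "template=value" to the fraction of that
    feature's occurrences where the predicted tag differs from the gold tag.
    """
    seen = Counter()   # how often each feature fires
    wrong = Counter()  # how often it fires at a mis-tagged position
    for words, gold, predicted in sentences:
        for i in range(len(words)):
            for name, template in TEMPLATES.items():
                feature = f"{name}={template(words, i)}"
                seen[feature] += 1
                if gold[i] != predicted[i]:
                    wrong[feature] += 1
    return {feature: wrong[feature] / seen[feature] for feature in seen}


# Made-up toy data: the first sentence has one tagging error on "yanjiu".
sentences = [
    (["yanjiu", "shengming"], ["v", "n"], ["n", "n"]),
    (["yanjiu", "wenti"], ["v", "n"], ["v", "n"]),
]
probs = error_feature_probabilities(sentences)
```

Under this reading, features with a high error probability point to contexts where the base model is weak, and the templates that generate them would be candidates for the supplementary models. Any threshold used to select them would be a further assumption.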
Keywords/Search Tags: error-driven, part-of-speech, annotation, maximum entropy