
Coreference Resolution of Chinese Simple Noun Phrases in the Sports News Domain

Posted on: 2010-03-24; Degree: Master; Type: Thesis
Country: China; Candidate: R Y Chen
GTID: 2208360275498499; Subject: Computer software and theory
Abstract/Summary:
Coreference is a ubiquitous natural language phenomenon in discourse that makes expressions more concise and sentences more coherent, yet it makes natural language harder for computers to understand. Coreference resolution is the process of determining whether two expressions in natural language refer to the same entity in the world. This dissertation studies Chinese coreference resolution and focuses on the impact of different features introduced into a decision-tree-based resolver. The major contributions are as follows:
1. Based on the characteristics of noun phrases in the sports news domain, we define the simple noun phrase and use rules based on part-of-speech (POS) tags to recognize such phrases. We establish an annotation standard for the test data and design annotation software for producing coreference resolution training instances.
2. A sentence-based length feature and a clause-based length feature are analyzed and compared for their effect on the performance of Chinese coreference resolution. When each is added to the coreference resolution system in turn, tests show that the clause-based feature outperforms the sentence-based one, improving resolution accuracy by 4.08% over it.
3. Drawing on studies of semantic class (SC) features in English coreference resolution, we propose a hyponymy-based SC feature for Chinese coreference resolution. We define three shallow semantic classes, Human, Place and Organization, which are used to classify noun phrases. Experiments on the test set show that a resolver employing this SC knowledge yields a statistically significant improvement of 8.54% in F-measure, which is higher than the gain from introducing the other features.
4. After investigating the characteristics and types of Chinese aliases, we propose a rule-based Chinese alias feature for coreference resolution, which examines the content of the noun phrases and classifies aliases coarsely.
5. We build a test platform for Chinese coreference resolution based on the decision tree algorithm and oriented to the sports news domain.
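The pairwise features described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the thesis's implementation: the `Mention` class, the delimiter sets, the reading of the "length feature" as the number of sentence or clause boundaries between two noun phrases, and the toy subsequence rule for aliases are all hypothetical.

```python
from dataclasses import dataclass

# Assumed delimiter sets (not taken from the thesis).
SENTENCE_END = "。！？"
CLAUSE_END = SENTENCE_END + "，；："


@dataclass
class Mention:
    text: str       # surface string of the noun phrase
    start: int      # character offset of the phrase in the document
    sem_class: str  # "Human", "Place", "Organization", or "Unknown"


def count_boundaries(doc: str, a: Mention, b: Mention, delimiters: str) -> int:
    """Count delimiter characters lying between the starts of two mentions."""
    lo, hi = sorted((a.start, b.start))
    return sum(1 for ch in doc[lo:hi] if ch in delimiters)


def is_alias(a: Mention, b: Mention) -> bool:
    """Toy rule: the shorter string is an in-order subsequence of the longer."""
    short, long_ = sorted((a.text, b.text), key=len)
    if len(short) == len(long_):
        return False  # same-length strings are not treated as aliases here
    remaining = iter(long_)
    # `ch in remaining` consumes the iterator, which enforces character order.
    return all(ch in remaining for ch in short)


def pair_features(doc: str, antecedent: Mention, anaphor: Mention) -> dict:
    """Build a hypothetical feature vector for one candidate mention pair."""
    return {
        "sent_dist": count_boundaries(doc, antecedent, anaphor, SENTENCE_END),
        "clause_dist": count_boundaries(doc, antecedent, anaphor, CLAUSE_END),
        "sc_agree": (antecedent.sem_class == anaphor.sem_class
                     and antecedent.sem_class != "Unknown"),
        "alias": is_alias(antecedent, anaphor),
    }


doc = "北京大学今天赢得了比赛，北大的队员非常高兴。教练说，球队表现出色。"
full_name = Mention("北京大学", doc.find("北京大学"), "Organization")
short_name = Mention("北大", doc.find("北大"), "Organization")
print(pair_features(doc, full_name, short_name))
```

In this example the two mentions fall in the same sentence but in different clauses, so the clause-based count separates them while the sentence-based count does not. Feature dictionaries of this kind could then be passed to any decision tree learner.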
Keywords/Search Tags: Coreference Resolution, Statistical Learning Method, Decision Tree, Semantic Class Feature, Alias Feature