Combination Of Rule-based And Statistical-based Anaphora Resolution

Posted on: 2010-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: K J Jin
Full Text: PDF
GTID: 2178360278457528
Subject: Computer application technology
Abstract/Summary:
With the rapid growth of the Internet, the volume of available information keeps increasing, yet it has become hard to find the information one needs quickly and accurately. As a result, more and more researchers have become interested in automatic summarization. Extractive summarization is one existing approach; an example is Event Semantic Relation based Multi-Document extractive Summarization (ESRMDS). The main idea of ESRMDS is to extract event terms from the document set, determine the semantic relations between event terms using a semantic resource, and compute the importance of each event term. From these, a weight is obtained for every sentence, the sentences are ranked, and the summary sentences are finally ordered according to the source text. In automatic summarization, an event term is defined as a verb or gerund that occurs between two named entities.
General documents contain many pronouns that refer to nouns or gerunds mentioned earlier. Under the definition above, event terms that occur between pronouns, or between a pronoun and a noun, are ignored, which reduces the number of event terms and degrades summarization performance. Anaphora resolution therefore becomes the key to improving automatic summarization.
This thesis combines rule-based and statistical anaphora resolution. We first apply rule-based anaphora resolution alone. An analysis of its recall, precision, and output documents shows the shortcoming of this method: it cannot determine which pronouns refer to named entities. We therefore combine the rule-based method with a Maximum Entropy model to address this problem and decide precisely which pronouns to replace. This improves precision and recall, increases the number of named entities, and extracts as many event terms as possible from the input sets, thereby improving summarization performance.
Experimental results show that this method improves summarization performance by 8.5% compared with the method without anaphora resolution. The readability and fluency of the summaries also improve.
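The effect of pronoun replacement on event-term extraction can be illustrated with a small sketch. This is not the thesis's implementation: the tag set (`NE`, `V`, `PRON`, `O`), the function names `event_terms` and `resolve`, and the toy sentences are all assumptions made for illustration. The sketch also reads "between two named entities" loosely, as a verb with at least one named entity before it and one after it in the same sentence, and it takes the pronoun-to-antecedent mapping as given rather than computing it with rules or a Maximum Entropy model.

```python
from typing import Dict, List, Tuple

# A token is (word, tag). The tag set is hypothetical:
# "NE" = named entity, "V" = verb or gerund, "PRON" = pronoun, "O" = other.
Token = Tuple[str, str]


def event_terms(sentence: List[Token]) -> List[str]:
    """Return verbs that have a named entity both before and after them."""
    terms = []
    for i, (word, tag) in enumerate(sentence):
        if tag != "V":
            continue
        has_left = any(t == "NE" for _, t in sentence[:i])
        has_right = any(t == "NE" for _, t in sentence[i + 1:])
        if has_left and has_right:
            terms.append(word)
    return terms


def resolve(sentence: List[Token], antecedents: Dict[int, str]) -> List[Token]:
    """Replace each pronoun whose position is in `antecedents` with that entity."""
    resolved = []
    for i, (word, tag) in enumerate(sentence):
        if tag == "PRON" and i in antecedents:
            resolved.append((antecedents[i], "NE"))
        else:
            resolved.append((word, tag))
    return resolved


# Toy document: "Alice met Bob." / "She thanked him."
first = [("Alice", "NE"), ("met", "V"), ("Bob", "NE")]
second = [("She", "PRON"), ("thanked", "V"), ("him", "PRON")]

# Without anaphora resolution, the verb in the second sentence is missed.
before = event_terms(first) + event_terms(second)

# With the pronouns replaced by their (assumed) antecedents, it is found.
second_resolved = resolve(second, {0: "Alice", 2: "Bob"})
after = event_terms(first) + event_terms(second_resolved)

print(before)  # ['met']
print(after)   # ['met', 'thanked']
```

In this toy case the second sentence contributes no event term until its pronouns are replaced, which is the loss the abstract attributes to unresolved anaphora.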
Keywords/Search Tags:anaphora resolution, rule, maximum entropy, name entity, semantic relation, event, automatic summary