
Research On Chinese Event Extraction Technology

Posted on: 2008-01-17  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Zhao  Full Text: PDF
GTID: 2178360245998162  Subject: Computer Science and Technology
Abstract/Summary:
Event extraction is an important research topic in information extraction. It presents an event described in natural language in structured form, e.g. who, where, when, and what is involved in the event. The technology can be widely applied in NLP research areas such as summarization, question answering, and information retrieval. This thesis studies the two stages of Chinese event extraction, event type recognition and event argument recognition, and develops a practical event extraction system named HIT-IR EES.

In event type recognition, the corpus contains only a small number of event instances, so the data sparseness caused by the small training set is the main difficulty of this stage. We present a novel method based on automatically extending event triggers: we first extend the event triggers with a thesaurus, then use the extended triggers to extract candidate events and their candidate types, and finally apply binary classification to recognize the type of each candidate event. This method effectively addresses both the class imbalance in model training and the data sparseness caused by the small training set, and it improves both the precision and the recall of event type recognition. Evaluation on the ACE2005 dataset shows a final F-score of 61.24%, which significantly outperforms traditional machine-learning-based methods.

In event argument recognition, the main difficulty is to identify the correct arguments among many entities, time expressions, and values. We present two methods for this problem: one based on SRL (Semantic Role Labeling) and one based on ME (Maximum Entropy). The SRL-based method approaches argument recognition from the application side, since the task closely matches that of SRL. However, because it relies heavily on lower-level techniques such as SRL and syntactic parsing, cascading errors become a serious problem. The ME-based method treats event argument recognition as a classification problem: it takes every entity, time expression, and value that appears as a candidate argument and describes each candidate with lexical, type, context, and syntactic features. A multi-class classifier then recognizes the role of each candidate argument. The experimental results show that the multi-class method achieves better results thanks to the large number of candidate argument instances. Evaluation on the ACE2005 dataset shows a final F-score of 64.64%.
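The candidate-generation steps described in the abstract can be illustrated with a minimal Python sketch. This is not the HIT-IR EES implementation: the toy seed triggers, the toy thesaurus, the function names, and the three features shown are all assumptions made for illustration. The classifiers themselves and the syntactic features are omitted.

```python
from collections import defaultdict

# Hypothetical toy data: seed triggers seen in training, keyed by event type.
SEED_TRIGGERS = {"Attack": {"attack"}, "Die": {"die"}}

# Hypothetical thesaurus: each word maps to a set of synonyms.
THESAURUS = {"attack": {"assault", "strike"}, "die": {"perish"}}


def expand_triggers(seed_triggers, thesaurus):
    """Map each seed trigger and its thesaurus synonyms to candidate event types."""
    trigger_types = defaultdict(set)
    for event_type, words in seed_triggers.items():
        for word in words:
            trigger_types[word].add(event_type)
            for synonym in thesaurus.get(word, ()):
                trigger_types[synonym].add(event_type)
    return dict(trigger_types)


def candidate_events(tokens, trigger_types):
    """Return one (position, token, candidate type) triple per trigger/type match.

    Each triple would become one instance for a yes/no (binary) classifier.
    """
    candidates = []
    for position, token in enumerate(tokens):
        for event_type in sorted(trigger_types.get(token.lower(), ())):
            candidates.append((position, token, event_type))
    return candidates


def candidate_arguments(mentions, trigger_position):
    """Build a feature dict for every entity, time, or value mention.

    A multi-class classifier would then assign a role to each candidate.
    """
    return [
        {
            "head": mention["text"].lower(),                     # lexical feature
            "mention_type": mention["type"],                     # type feature
            "distance": mention["position"] - trigger_position,  # context feature
        }
        for mention in mentions
    ]


# Toy usage on an English sentence (the thesis itself targets Chinese text).
tokens = ["Rebels", "assault", "the", "town", "on", "Monday"]
mentions = [
    {"text": "Rebels", "type": "entity", "position": 0},
    {"text": "Monday", "type": "time", "position": 5},
]
events = candidate_events(tokens, expand_triggers(SEED_TRIGGERS, THESAURUS))
arguments = candidate_arguments(mentions, trigger_position=events[0][0])
```

In this sketch the word "assault" never appears among the seed triggers, yet it still yields a candidate event through the thesaurus expansion, which is how extending triggers can ease data sparseness.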
Keywords/Search Tags: Event Extraction, Event Type Recognition, Event Argument Recognition, Maximum Entropy