
Research On Event-oriented Knowledge Processing

Posted on: 2011-07-15; Degree: Doctor; Type: Dissertation
Country: China; Candidate: J F Fu; Full Text: PDF
GTID: 1118360308476473; Subject: Computer application technology
Abstract/Summary:
Taking the "event" as a basic unit of knowledge representation and an important means of organizing information has received increasing attention. The study of event-oriented knowledge can support information processing technologies such as automatic summarization and question answering. This dissertation focuses on four aspects: the construction of an event-oriented Chinese corpus, event recognition, event argument recognition, and event causal relation extraction. To address the shortcomings of existing work in these areas, it presents the following practical solutions:

1. Corpus construction is a fundamental task in natural language processing. Existing event-oriented corpora employ different annotation systems depending on their research purposes and objects. These annotation systems mainly focus on certain types of events or event arguments, but ignore general events and people's understanding and awareness of events. In this dissertation, an event-based questionnaire is designed, the common-sense understanding of events in text is analyzed from the questionnaire results, the taggability of Chinese events is explored, and a method for building a Chinese event corpus is presented. This method is not limited to certain types of events; it covers all events mentioned in the text. In addition, the method is suited to Chinese because it is based on syntactic and semantic analysis of Chinese sentences. Evaluation results show that the method achieves high annotation agreement. Furthermore, we developed an annotation tool, collected 200 reports about emergencies as a raw corpus, and annotated them to build the Chinese Event Corpus (CEC). Nearly ten research members took part in the annotation work over 10 months. Compared with the ACE and TimeBank corpora, the CEC is the smallest, but its annotation of events and event arguments is the most comprehensive.

2. Event recognition is the basis of the event extraction task.
Most current approaches to event recognition employ machine learning methods, which depend on finding effective features to improve system performance. This dissertation presents an event recognition method based on combining multiple features. The feature vector combines context features, part-of-speech features, grammatical features, and semantic features. Experiments with two different classifiers, together with an analysis of how well these features discriminate, are carried out. The experimental results show that performance improves markedly as effective features are added, and that the system achieves its best performance when the features are combined. Compared with a tf×idf-based event recognition method, our method performs better.

3. Event argument recognition based on supervised (classification) learning needs a large-scale annotated corpus as a training set from which to learn about event arguments. This approach relies heavily on the corpus and performs poorly when the corpus is sparse. This dissertation presents a method for event argument recognition based on semi-supervised clustering and feature weighting, which reduces the dependence on the corpus. In this method, a small amount of labeled data is taken as a seed set to guide the clustering. Different weights are assigned to features according to their contribution to the clustering. In addition, the traditional semi-supervised clustering algorithm (Constrained-KMeans) and feature weighting algorithm (ReliefF) are improved to suit the task of event argument identification. Experimental results show that our method achieves good performance when labeled data is insufficient.

4. The event causal relation is an important semantic relation, and its extraction has broad application prospects.
Traditional methods for event causal relation extraction are limited to marked, intra-sentence, "one cause, one effect" relations. In fact, text also contains a large number of unmarked, cross-sentence/cross-paragraph, "one cause, many effects", "many causes, one effect", and "many causes, many effects" causal relations. This dissertation presents a method for event causal relation extraction based on cascaded Conditional Random Fields (CRFs). The method casts event causal relation extraction as an event sequence labeling problem and employs a two-layer CRF model to label the causal relations in an event sequence. The first layer labels the semantic role that each event plays in a causal relation; its outputs are then passed to the second layer, which labels the boundaries of each event causal relation. The corpus analysis and experimental results show that our method not only covers every class of event causal relation in text (marked/unmarked; intra-sentence/cross-sentence/cross-paragraph; "one cause, one effect", "one cause, many effects", "many causes, one effect", and "many causes, many effects"), but also achieves good performance.
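
The seeded clustering idea in item 3 can be illustrated with a minimal sketch. It assumes the commonly described form of Constrained-KMeans, in which labeled seeds initialize the centroids and keep their labels during every reassignment, combined with a feature-weighted Euclidean distance. The function names, the toy data, and the hand-set weights are all hypothetical; the dissertation's improved Constrained-KMeans and ReliefF variants are not reproduced here, and in practice the weights would come from a feature weighting step rather than being fixed by hand.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature is scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def constrained_kmeans(points, seeds, weights, max_iter=100):
    """Seeded K-means sketch.

    points  : list of feature vectors
    seeds   : dict mapping point index -> cluster id (assumption: ids are
              0..k-1 and every cluster has at least one seed)
    weights : one non-negative weight per feature (hand-set here)
    """
    k = max(seeds.values()) + 1
    # Initialize each centroid from the seeds labeled with that cluster.
    centroids = [
        mean([points[i] for i, label in seeds.items() if label == c])
        for c in range(k)
    ]
    labels = [None] * len(points)
    for _ in range(max_iter):
        new_labels = []
        for i, p in enumerate(points):
            if i in seeds:
                # Seed points never change cluster.
                new_labels.append(seeds[i])
            else:
                new_labels.append(
                    min(range(k),
                        key=lambda c: weighted_distance(p, centroids[c], weights))
                )
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Recompute centroids; seeds guarantee every cluster is non-empty.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            centroids[c] = mean(members)
    return labels, centroids


# Toy data: feature 0 separates the clusters, feature 1 is mostly noise,
# so it receives a small weight.
points = [[0.0, 0.0], [10.0, 0.0], [1.0, 9.0], [9.0, 1.0], [0.5, 8.0], [9.5, 9.0]]
seeds = {0: 0, 1: 1}          # two labeled examples, one per cluster
weights = [1.0, 0.1]
labels, centroids = constrained_kmeans(points, seeds, weights)
print(labels)
```

With only two labeled points, the remaining four are assigned by distance to the seed-initialized centroids; down-weighting the noisy feature keeps it from dominating that distance.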
Keywords/Search Tags: Event, Chinese Event Corpus, Event Recognition, Event Argument Recognition, Causal Relation Extraction