Open-domain Event Extraction and Microblog Event Detection and Tracking

Posted on: 2014-09-08  Degree: Master  Type: Thesis
Country: China  Candidate: J J Zhao
GTID: 2268330422950581  Subject: Computer Science and Technology
Abstract/Summary:
With the arrival of the Web 2.0 era, web data has grown explosively, leaving people drowning in an ocean of data. How to process this big data and deliver it to people efficiently has therefore become a serious problem. Open-domain event extraction was proposed against this background.

In this thesis, open-domain event extraction differs from traditional event extraction. Its core is trigger extraction for any domain, and it also covers event arguments such as time, location, and people. The research focuses mainly on extracting events from free text. In addition, because microblogging has become a platform for information sharing, extracting events from microblogs is significant, so this thesis also studies event detection, tracking, and presentation for microblogs.

For open-domain event extraction, this thesis divides the task into two subtasks: open-domain event trigger extraction and event argument extraction. For trigger extraction, it presents two methods: a rule-based approach and a conditional random field (CRF) model approach. The rule-based approach requires rules to be constructed by hand; it extracts quickly and performs strongly, but its coverage is incomplete and it relies too heavily on syntax. The CRF model has high accuracy and is little affected by syntax, but it performs poorly on complex sentences. Based on this analysis, we combine the two methods and run experiments to demonstrate the effectiveness of the combination. For argument extraction, this thesis first uses the maximum entropy (ME) model. It is a simple method, but it does not take the relationships between candidate arguments into account, so it cannot properly handle a single sentence that contains multiple events. This thesis therefore proposes a hypergraph-partitioning method for extracting event arguments. The method fuses the linguistic features between triggers and arguments into a hypergraph and then applies a partitioning method to extract the arguments. It performs well on single sentences that contain multiple events.

For microblog data, this thesis presents a complete framework covering event detection, tracking, and presentation. Given the characteristics of microblog data, we pay close attention to how events evolve over time. For event tracking, this thesis takes a novel graph-theoretic perspective, using a bipartite graph matching algorithm to track events. For event presentation, it takes a sociological view and introduces influence factors to improve the representation of events.

Open-domain event extraction can help users extract useful information from the web, and it can also support higher-level NLP tasks such as QA and knowledge engineering. The work therefore has important research significance for both applications and industry.
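
The graph-matching view of event tracking mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual algorithm, which the abstract does not specify. It assumes that each event is represented as a set of keywords, that similarity is measured by Jaccard overlap, and that an edge exists only when the similarity reaches a threshold; the names `jaccard`, `match_events`, and `threshold` are hypothetical. Events from the previous time window form one side of a bipartite graph, events from the current window form the other, and a maximum-cardinality matching (Kuhn's augmenting-path method) links each current event to at most one earlier event.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_events(prev_events, curr_events, threshold=0.3):
    """Maximum-cardinality bipartite matching via augmenting paths.

    prev_events / curr_events: dict mapping event id -> set of keywords.
    Returns a dict mapping each matched current event id to the id of
    the previous event it continues.
    """
    # Assumption: two events are linked only if their keyword overlap
    # reaches the threshold. The thesis may use different features.
    edges = {
        p: [c for c, ckw in curr_events.items() if jaccard(pkw, ckw) >= threshold]
        for p, pkw in prev_events.items()
    }
    owner = {}  # current event id -> previous event id

    def try_assign(p, seen):
        # Try to give p a current event, re-routing earlier matches if needed.
        for c in edges[p]:
            if c in seen:
                continue
            seen.add(c)
            if c not in owner or try_assign(owner[c], seen):
                owner[c] = p
                return True
        return False

    for p in prev_events:
        try_assign(p, set())
    return owner


if __name__ == "__main__":
    prev = {"e1": {"earthquake", "sichuan", "rescue"},
            "e2": {"football", "final", "goal"}}
    curr = {"n1": {"earthquake", "rescue", "aftershock"},
            "n2": {"concert", "tickets"}}
    print(match_events(prev, curr))
```

In this toy run, `n1` is matched to `e1` because the two share two of four distinct keywords. `n2` stays unmatched and would be treated as a newly detected event, while `e2` has no continuation in the current window.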
Keywords/Search Tags: open-domain, event extraction, hypergraph partitioning, event detection, event tracking