Research On Web Text Event Extraction Concerning Specific Citizen Groups

Posted on: 2015-03-05  Degree: Master  Type: Thesis
Country: China  Candidate: L Qiao  Full Text: PDF
GTID: 2298330422487400  Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology and the World Wide Web, the amount of electronic data has grown enormously and continues to increase exponentially. Given this information explosion, information extraction, a necessary tool that transforms unstructured text into structured or semi-structured records, has become an unprecedentedly important research issue.

Event detection and recognition, also known as event extraction, is a fundamental task defined by the ACE (Automatic Content Extraction) evaluation conference, and has become a key technology for information extraction and automatic text summarization. Event extraction, which focuses on extracting structured event attributes from natural language, can provide a better data environment for further data processing and applications. As a formal and explicit specification of a shared conceptualisation, ontology plays an increasingly important role in applications of artificial intelligence, such as information processing and natural language understanding.

This thesis focuses on extraction models and algorithm design for web event extraction concerning specific citizen groups in profession-related fields. Specifically, taking civil servants as the specific group, it studies and addresses the problems of event type recognition, ontology construction and element extraction for that group. The models and methods presented here are intended to enrich and develop intelligent information extraction for massive amounts of network data by analyzing and summarizing the regular patterns, models and methods of web events for generalized groups of people.

The main research works are as follows:
1. This thesis presents a semi-automatic event ontology model for recognizing civil-servant-related events, developed through theoretical analysis and experimental evaluation on a large network text corpus. An event trigger word extraction algorithm and an event trigger clustering algorithm are adopted in the model.
2. For event type recognition, this thesis proposes an algorithm that discriminates event types by calculating the similarity between the event trigger word and the ontology concepts. The performance of the method is also analyzed by comparison with the maximum entropy binary classification method.
3. This thesis presents an event element extraction model based on the event ontology. The method extracts event elements using event templates and semantic role labeling (SRL). The time elements are further adjusted according to the publish time.

Experimental results on civil-servant-related corpus data show that the method is efficient and achieves a favorable F-score in event recognition and event element recognition.

The work in this thesis can be generalized to web text event analysis and extraction concerning any specific groups of people or subjects that share common topic attributes, and can play an important role in constructing public sentiment analysis and other related systems.
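
The similarity-based event type recognition described in item 2 could be sketched roughly as below. This is an illustration under stated assumptions and not the thesis's actual algorithm: the abstract does not say which similarity measure is used, so cosine similarity over toy word vectors stands in for it, and the names `WORD_VECTORS`, `ONTOLOGY`, `recognize_event_type` and the `threshold` value are all hypothetical.

```python
import math

# Hypothetical toy vectors. A real system would take these from a lexical
# resource or trained embeddings; the abstract does not specify the source.
WORD_VECTORS = {
    "promoted": [0.9, 0.1, 0.0],
    "appointed": [0.8, 0.2, 0.1],
    "investigated": [0.1, 0.9, 0.1],
    "dismissed": [0.2, 0.8, 0.3],
}

# Hypothetical event ontology: each concept lists representative trigger words.
ONTOLOGY = {
    "Appointment": ["appointed"],
    "Discipline": ["investigated", "dismissed"],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recognize_event_type(trigger, threshold=0.8):
    """Return the ontology concept most similar to the trigger word.

    Returns None when the trigger is unknown or no concept reaches
    the (assumed) similarity threshold.
    """
    if trigger not in WORD_VECTORS:
        return None
    best_type, best_score = None, 0.0
    for concept, words in ONTOLOGY.items():
        # Score a concept by its closest representative trigger word.
        score = max(cosine(WORD_VECTORS[trigger], WORD_VECTORS[w]) for w in words)
        if score > best_score:
            best_type, best_score = concept, score
    return best_type if best_score >= threshold else None
```

Under these toy vectors, calling `recognize_event_type("promoted")` selects the `Appointment` concept because "promoted" lies closest to "appointed"; a trigger word with no vector yields `None`.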
Keywords/Search Tags: Event Extraction, Specific Citizen Groups, Event Element Recognition, Ontology