Research On Event-Oriented Text Representation And Applications

Posted on: 2015-02-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Liao
GTID: 1228330434959459
Subject: Computer application technology
Abstract/Summary:
Humans understand and remember the world through events. An event reflects movement, behavior, and change in the real world, and the real world is composed of numerous interrelated events. As a basic unit of human knowledge, the event matches the way humans think. Events have been studied in cognitive science, linguistics, artificial intelligence, and other fields, and in recent years they have become a hot topic in natural language processing.

Current studies of events in natural language processing focus mostly on event-oriented application technologies, but event-oriented text representation is the basis of those technologies. An event-oriented method for text representation is therefore needed to support the various kinds of event-oriented information processing. This paper studied the linguistic expression and completion of event elements and mined the relationships between events. Building on the idea of event network text representation proposed by the project group, this paper treated the event as the basic semantic unit of narrative texts, studied an event-oriented text representation method (the event network), and explored the theory and methods for constructing event networks and their applications. The main research content and contributions of this paper are as follows:

(1) The linguistic expression and completion of event elements. Based on the CEC corpus, this paper analyzed in depth the event instances in the annotated texts. First, we extracted event denoters from the corpus as event features, clustered them semi-automatically using the HIT IR-Lab Tongyici Cilin thesaurus, and obtained a table of similar event denoters. Then we studied how the time, environment, and object elements of events are expressed in language, in order to find the expression rules for each event element. Finally, we analyzed the omission (default) of event elements in the annotated texts and defined heuristic rules, based on context structure and semantic relationships, to detect omitted event elements and complete them; the experiments achieved the desired results.

(2) Mining the relationships between events based on event co-occurrence. By organizing and analyzing the relationships between events, we can find the semantic relationships between event classes in texts. Building on theoretical research on co-occurrence, we first analyzed the event co-occurrence phenomenon. This paper used the text sets of five event topic classes in the CEC corpus and built event co-occurrence networks using the sentence, the paragraph, and the whole text, respectively, as the window unit. We then treated the extraction of event co-occurrence pairs as a process of extracting rules for fixed semantic relations between events, used association rule mining to extract event co-occurrence pairs from the different event co-occurrence networks, and obtained the semantic relationships between event classes by generalizing and analyzing these pairs. Finally, we studied the extraction of important events based on the event co-occurrence network and verified its effectiveness experimentally.

(3) Research on an event-oriented method for text representation. After studying several traditional text representation models, this paper developed the event network model on the basis of the idea of event network text representation proposed by the project group. The event network is a graph-structured model in which nodes represent events and edges represent the relationships between events. Different event networks can be constructed by selecting different event relationships. This paper studied two types, the undirected event network and the directed event network, and gave their definitions and construction methods. The undirected event network is built from the neighboring relationship of events within a paragraph or from the event similarity relationship. It contains not only event feature information but also text structure information and similarity information between events. The directed event network is built from the adjacency relationship of events within a sentence or from the non-classified semantic relationships between events. Through the directed event network we can follow intuitively how the events in a text occur and develop, and can better understand the semantic knowledge of the text.

(4) Research on event-network-based algorithms and applications. For the two types of event networks, this paper studied related algorithms in combination with two specific information processing applications, automatic summarization and text classification, and verified the effectiveness of the event network model experimentally. We proposed a sub-event topic community detection algorithm to obtain the sub-event topic relatedness of each event, and added it to the event weight when calculating the importance of events. On this basis, we implemented automatic text summarization based on the undirected event network. Exploiting the characteristics of the directed event network, we presented a maximum common subgraph matching algorithm to calculate the similarity between event networks and applied it to text classification. The experimental results showed that automatic summarization and text classification based on event networks perform well.
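
To make the graph constructions described in the abstract concrete, the following Python sketch builds a windowed event co-occurrence network, builds a directed event network from adjacent events in each sentence, and scores two directed networks by the size of their common subgraph. It is an illustration under stated assumptions, not the dissertation's algorithms: the function names, the toy event labels, and the normalization by the larger graph are all hypothetical, and the sketch assumes every event is identified by a unique class label, in which case the maximum common subgraph reduces to the shared nodes and shared edges.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(windows):
    """Count undirected co-occurrence edges between event labels.

    `windows` is a list of windows (sentences, paragraphs, or whole texts),
    each given as a list of event labels. Assumed input format.
    """
    edges = Counter()
    for events in windows:
        # Each unordered pair of distinct events in a window co-occurs once.
        for a, b in combinations(sorted(set(events)), 2):
            edges[(a, b)] += 1
    return edges


def directed_network(sentences):
    """Link each event to the event that follows it in the same sentence."""
    nodes, edges = set(), set()
    for events in sentences:
        nodes.update(events)
        edges.update(zip(events, events[1:]))
    return nodes, edges


def common_subgraph_similarity(g1, g2):
    """Similarity from shared nodes and edges, normalized by the larger graph.

    Assumes node labels are unique, so matching nodes by label yields the
    maximum common subgraph directly. The normalization is one common
    choice, not necessarily the one used in the dissertation.
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    common = len(nodes1 & nodes2) + len(edges1 & edges2)
    largest = max(len(nodes1) + len(edges1), len(nodes2) + len(edges2))
    return common / largest if largest else 0.0


# Toy documents: each inner list holds the events of one sentence, in order.
doc_a = [["earthquake", "collapse", "rescue"], ["rescue", "donate"]]
doc_b = [["earthquake", "rescue"], ["rescue", "donate"]]

cooc_a = cooccurrence_network(doc_a)
# 3 shared nodes + 1 shared edge, out of 7 elements in the larger graph: 4/7
similarity = common_subgraph_similarity(directed_network(doc_a),
                                        directed_network(doc_b))
```

For text classification, a score of this kind could be used with a nearest-neighbor rule over labeled event networks. Without unique node labels, maximum common subgraph matching is a much harder combinatorial problem and would need a dedicated algorithm.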
Keywords/Search Tags: Text Representation, Event Co-occurrence, Event Network, Sub-Event Topic Detection, Maximum Common Subgraph Matching