System On Generating Event Chain In Chinese Texts

Posted on: 2016-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: R Wang
Full Text: PDF
GTID: 2308330482451145
Subject: Computer technology
Abstract/Summary:
Event chain generation helps reveal the hierarchical structure and framework of a discourse, and it is fundamental to understanding a discourse in depth. An event chain, a linear expansion of event information, is closely related to the lexical chain. Existing event chains are mostly linear structures that do not distinguish primary from secondary events, so they do not capture the focus of an article clearly. This thesis proposes a new method, based on topic segmentation, that builds the event chain from the lexical chain and distinguishes primary from secondary events. The approach exposes the hierarchical structure and describes the basic framework of a discourse more accurately. The main contents of the thesis are:
(1) Using a support vector machine (SVM) to find effective features for topic segmentation. Based on the characteristics of topic segments, we extract boundary features that mark a topic shift and features that capture the similarity of sentences expressing the same topic, and then perform topic segmentation with an SVM.
(2) Constructing lexical chains with a HowNet-based method for computing word similarity. We extract the nouns of an article to build a candidate set; for each word in the candidate set, we choose the lexical chain most relevant to it and add the word to that chain.
(3) Combining the segmented topics with the lexical chains to generate the event chain. We find the strongest chain among the constructed lexical chains. First, we find the event trigger words for the strongest lexical chain and build an initial event chain. Second, we judge primary and secondary events according to the segmented topics and build an event chain that contains primary and secondary information.
Our automatic event chain generation system, implemented in Java, is divided into three modules: topic segmentation, lexical chain generation, and event chain generation.
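The lexical chain construction in step (2) and the strongest-chain selection in step (3) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the thesis's Java implementation. Several parts are assumptions made only for illustration: the `toy_similarity` table stands in for the HowNet-based similarity, the `threshold` value is invented, a word's relevance to a chain is taken as its maximum similarity to any member, and the "strongest" chain is taken to be the longest one. The abstract does not specify any of these details.

```python
from typing import Callable, List

# Hypothetical stand-in for a HowNet-based similarity measure.
# The word pairs and scores below are invented for illustration only.
TOY_SIMILARITY = {
    frozenset(["earthquake", "aftershock"]): 0.9,
    frozenset(["earthquake", "disaster"]): 0.8,
    frozenset(["rescue", "relief"]): 0.85,
}


def toy_similarity(a: str, b: str) -> float:
    """Return a similarity in [0, 1]; unknown pairs score 0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset([a, b]), 0.0)


def build_lexical_chains(
    nouns: List[str],
    similarity: Callable[[str, str], float] = toy_similarity,
    threshold: float = 0.6,  # assumed cut-off, not taken from the thesis
) -> List[List[str]]:
    """Greedily attach each candidate noun to its most relevant chain."""
    chains: List[List[str]] = []
    for word in nouns:
        best_chain, best_score = None, 0.0
        for chain in chains:
            # Relevance of a word to a chain: best similarity to any member.
            score = max(similarity(word, member) for member in chain)
            if score > best_score:
                best_chain, best_score = chain, score
        if best_chain is not None and best_score >= threshold:
            if word not in best_chain:
                best_chain.append(word)
        else:
            # No sufficiently related chain exists, so start a new one.
            chains.append([word])
    return chains


def strongest_chain(chains: List[List[str]]) -> List[str]:
    """Assumed notion of strength: the chain with the most words."""
    return max(chains, key=len)


nouns = ["earthquake", "rescue", "aftershock", "relief", "disaster"]
chains = build_lexical_chains(nouns)
print(chains)
print(strongest_chain(chains))
```

With the toy scores, the five nouns fall into two chains, and the three-word chain around "earthquake" is selected as the strongest. A real system would replace `toy_similarity` with a HowNet lookup and would likely weight chain strength by more than length.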
In tests on the Treebank news corpus, automatic event chain generation achieves a precision of 65.39% and a recall of 67.59%. The results show that event chains generated by constructing lexical chains can effectively express the primary and secondary information of a discourse.
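For readers unfamiliar with the two metrics, the sketch below shows how precision and recall are commonly computed from a set of generated events and a set of reference events. It is an assumption that evaluation uses exact set matching; the abstract does not describe the matching criterion, and the event labels are invented. The last lines derive the harmonic mean (F1) of the two reported figures, a value the abstract itself does not report.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Precision and recall under exact set matching (an assumed criterion)."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Invented event labels, used only to exercise the functions.
predicted = {"quake_strikes", "rescue_begins", "aid_arrives", "school_reopens"}
gold = {"quake_strikes", "rescue_begins", "aid_arrives", "road_cleared", "power_restored"}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")

# Derived from the reported 65.39% precision and 67.59% recall.
reported_f1 = f1_score(65.39, 67.59)
print(f"derived F1 = {reported_f1:.2f}%")
```

In the toy example three of four generated events are correct and three of five reference events are recovered. Applied to the reported figures, the harmonic mean comes to roughly 66.5%.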
Keywords/Search Tags: Topic segmentation, Lexical chain, Event chain