
Chinese Discourse Relation Recognition And The Application Research

Posted on: 2014-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Song
Full Text: PDF
GTID: 2268330422450610
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet and the underlying information extraction technology, search engines and other Internet applications place high demands on text analysis. They need to understand not only a single sentence but also a whole document, so document-level semantic analysis has received growing attention. Discourse relations play a key role in document analysis, natural language processing, and information retrieval. For example, causal relations in discourse are of great help to question answering systems. A discourse relation is the semantic association between two text spans in a document.

Research on discourse relations in English has drawn increasing attention in recent years, mainly owing to the Penn Discourse Treebank. However, there is little Chinese research in this field. One important reason is that there is no large-scale Chinese discourse relation corpus.

In this thesis, we study Chinese discourse relations comprehensively. First, we build a Chinese discourse corpus consisting of 1096 texts. To mine the expression characteristics of Chinese discourse, we analyze the corpus we built, for example the semantic ambiguity between two relations. Discourse relations fall into two major categories: explicit relations and implicit relations.

An explicit relation is a discourse relation that holds between two text spans and is signaled by an explicit discourse connective. Our corpus analysis shows that the connective can represent the discourse relation, so we recognize explicit discourse relations using the connective. Experimental results show that this method performs well: the F-score for the condition relation reaches 94.49%, and other relations also perform well.

Automatic sense prediction for implicit relations is an outstanding challenge relative to explicit relations in discourse processing. We therefore recognize implicit relations with machine learning methods. We build a maximum entropy model and an SVM model and extract the corresponding features. Experimental results show that the SVM model achieves better results; the F-score for the expansion relation reaches 72.36%. We also analyze the features used in the experiments: the cue word feature performs best and plays a significant role across the various relations. Because the amount of supervised corpus data is limited, we extract a number of implicit causal instances by removing the connective from explicit instances and add them to the training corpus; this method improves the F-score by 8.3%.

Finally, this thesis examines the relationship between discourse relations and event relations. For causal event relations, we find that adding the discourse relation feature is more effective than the traditional method for event relation extraction.
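The two connective-based steps described above (recognizing an explicit relation from its connective, and turning explicit causal instances into additional implicit training instances by removing the connective) can be illustrated with a minimal Python sketch. The lexicon `CONNECTIVE_TO_RELATION`, the relation labels, and the function names are hypothetical and chosen only for illustration; they are not the connective inventory or the implementation used in the thesis, and the sketch assumes the connective appears at the start of a text span.

```python
# Toy connective lexicon (hypothetical; not the thesis's actual inventory).
CONNECTIVE_TO_RELATION = {
    "因为": "causal",      # "because"
    "所以": "causal",      # "so"
    "如果": "condition",   # "if"
    "但是": "comparison",  # "but"
}


def split_connective(span):
    """Return (relation, remainder) if the span starts with a known connective,
    otherwise (None, span)."""
    for connective, relation in CONNECTIVE_TO_RELATION.items():
        if span.startswith(connective):
            return relation, span[len(connective):]
    return None, span


def classify_explicit(arg1, arg2):
    """Label an explicit instance by the first connective found on either span."""
    for span in (arg1, arg2):
        relation, _ = split_connective(span)
        if relation is not None:
            return relation
    return None


def to_pseudo_implicit(arg1, arg2):
    """Strip causal connectives from an explicit causal instance so it can be
    added to the implicit training data. Returns None for non-causal input."""
    rel1, rest1 = split_connective(arg1)
    rel2, rest2 = split_connective(arg2)
    if "causal" not in (rel1, rel2):
        return None
    new_arg1 = rest1 if rel1 == "causal" else arg1
    new_arg2 = rest2 if rel2 == "causal" else arg2
    return new_arg1, new_arg2, "causal"
```

For instance, `classify_explicit("如果明天下雨", "比赛取消")` labels the pair as a condition relation, while `to_pseudo_implicit("因为下雨", "所以比赛取消")` yields the connective-free pair with a causal label. A real system would need to handle ambiguous connectives and connectives that are not span-initial.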
Keywords/Search Tags: Discourse relation, Explicit relation, Implicit relation, Corpus, Event relation