Research And Implementation Of Semantic Triples Construction Based On Dependency Parsing

Posted on: 2015-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y E Li
GTID: 2268330431452408
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, knowledge and information are growing explosively, but search engines are not yet intelligent enough to meet people's actual needs. In response, the W3C proposed the Semantic Web. For Chinese, the main task in building the Semantic Web is to extract the semantic triples of a sentence. This thesis therefore focuses on the theory of syntactic analysis and its related methods in natural language processing. Dependency parsing is used to build the semantic triples (subject, predicate and object) of complex, long Chinese sentences. Extracting these triples lays the foundation for constructing the Semantic Web automatically.

The word order of long Chinese sentences is flexible and their dependencies are complex. This thesis therefore uses a Root Searcher to divide a long sentence into two short sentences, and then runs dependency parsing on each short sentence. The HIT Dependency Treebank, which contains a comparatively large number of long sentences, is selected as the training and testing corpus. First, the HIT Dependency Treebank is converted from XML format to TXT format using DOM4j in Java. Then, a support vector machine is trained on the node words and used to predict the root node. LIBSVM serves as the binary classifier for constructing the Root Searcher, and the features related to the root node are extracted and compared in a set of experiments. These experiments identify the optimal feature combination for Root Searcher performance. Finally, to avoid the greedy behaviour of the Arc-eager algorithm on long-distance dependencies, the Arc-eager algorithm is combined with a support vector machine to parse the dependencies of the short sentences. The semantic triples are then extracted, and a comparative experiment is carried out to analyze the result.

In the comparative experiment, 1000 long sentences and the 1981 short sentences obtained by dividing them are used to analyze dependency precision. The method first constructs the Root Searcher and divides each long sentence into two short sentences, and then parses the dependencies of the short sentences. Theoretical analysis and experimental results show that, with this method, the precision of the root node, the subject-predicate relationship and the verb-object relationship is higher than when the original long sentences are parsed directly.
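The triple-extraction step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes the parse is already available as a list of tokens with 1-based head indices, and it assumes the relation labels `SBV` (subject-verb), `VOB` (verb-object) and `HED` (root), which are hypothetical choices here. The names `Token` and `extract_triples` are likewise made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    form: str  # surface word
    head: int  # 1-based index of the head token; 0 marks the root
    rel: str   # dependency label (assumed label set: SBV, VOB, HED, ...)


def extract_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (subject, predicate, object) triples from one parsed sentence.

    A triple is emitted for every word that governs both an SBV dependent
    (its subject) and a VOB dependent (its object).
    """
    triples = []
    for index, predicate in enumerate(tokens, start=1):
        subjects = [t.form for t in tokens if t.head == index and t.rel == "SBV"]
        objects = [t.form for t in tokens if t.head == index and t.rel == "VOB"]
        for subject in subjects:
            for obj in objects:
                triples.append((subject, predicate.form, obj))
    return triples


# Hand-written toy parse of a short sentence ("I like music").
sentence = [
    Token("我", 2, "SBV"),
    Token("喜欢", 0, "HED"),
    Token("音乐", 2, "VOB"),
]

for triple in extract_triples(sentence):
    print(triple)
```

In the pipeline the abstract describes, a function of this kind would be applied to each of the two short sentences produced by the Root Searcher, rather than to the original long sentence.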
Keywords/Search Tags: Root Searcher, dependency parsing, support vector machine, semantic triples