An Automatic Recognition Method of Chinese Relation Words in Compound Sentences Based on Dependency Tree Similarity

Posted on: 2016-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2308330464472039
Subject: Computer software and theory
Abstract/Summary:
Compound sentences are attracting growing research attention. Relation words are important markers of both the structural and the semantic relationships within a compound sentence, so studying them is central to the study of compound sentences and underlies work on their hierarchical division and semantic structure. Accurate automatic recognition of relation words is therefore very important for computer understanding of discourse.
Most conventional methods for automatically recognizing relation words rely on a rule library and constraints. To address the deficiencies of rule-based recognition, this thesis uses dependency trees, which capture the dependencies and syntactic structure of compound sentences, together with the observation that a relation word and its collocation form a frozen structure in a compound sentence, and presents an automatic recognition method based on the similarity between a dependency treebank and compound sentences.
First, the thesis briefly introduces how the dependency treebank is built and how the corpus is sampled and analyzed to obtain the relation-word dependency treebank that serves as the similarity reference set. It then verifies the rationality and feasibility of the method experimentally. The automatic recognition system uses a relation-word collocation library to extract quasi-relation words and builds a dependency tree for them; the content similarity and the syntactic structure similarity are then calculated for this tree, and these two values serve as the criteria for recognizing relation words. Finally, the thesis analyzes the test results, draws conclusions, and offers recommendations for improving the accuracy of the analyzer. All compound sentences used in this thesis are from CCCS.
The experiments comprise two groups: one in which the compound sentences are divided by their features, and one in which they are not. The average content similarity for the two groups is 91.4% and 90.8%, respectively, and the average structural similarity is 80.7% and 79.1%, respectively. The relation-word recognition accuracy is 91.7% and 91.9%, respectively, which is 6 points higher than the rule-based method. The results show that compound sentences have a frozen structure and that the method is effective.
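The abstract does not give the formulas used for the two similarity measures, so the following is only a minimal sketch of how content similarity and structural similarity between a quasi-relation-word dependency tree and a reference tree might be computed. Everything in it is an assumption made for illustration: the `Node` class, the use of the Dice coefficient, the choice of (head label, child label) pairs as the structural unit, and the toy trees, which are not linguistically validated parses.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word in a dependency tree (hypothetical minimal representation)."""
    word: str
    relation: str  # dependency label linking this word to its head
    children: list = field(default_factory=list)


def walk(node):
    """Yield every node in the tree, parent before children."""
    yield node
    for child in node.children:
        yield from walk(child)


def dice(a, b):
    """Dice coefficient over two multisets (assumed measure, not the thesis's)."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 1.0  # two empty multisets are treated as identical
    overlap = sum((a & b).values())
    return 2 * overlap / total


def content_similarity(t1, t2):
    """Compare the words that appear in the two trees."""
    words1 = Counter(n.word for n in walk(t1))
    words2 = Counter(n.word for n in walk(t2))
    return dice(words1, words2)


def structure_similarity(t1, t2):
    """Compare the (head label, child label) pairs of the two trees."""
    def edges(tree):
        return Counter(
            (n.relation, c.relation) for n in walk(tree) for c in n.children
        )
    return dice(edges(t1), edges(t2))


# Toy trees: a reference tree from the treebank and a candidate tree.
reference = Node("所以", "HED", [Node("因为", "ADV"), Node("下雨", "SBV")])
candidate = Node("所以", "HED", [Node("因为", "ADV"), Node("刮风", "SBV")])

print(round(content_similarity(reference, candidate), 3))  # 0.667
print(structure_similarity(reference, candidate))          # 1.0
```

In this toy case the two trees share two of their three words but have identical label structure, which mirrors the idea that a relation-word collocation keeps a frozen structure even when the surrounding content changes.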
Keywords/Search Tags: relation words, treebank, similarity, automatic recognition