Research On The Construction Of Hierarchical Structure Of Non-filled State Complex Sentences Based On Decision Tree

Posted on: 2019-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Xiao
Full Text: PDF
GTID: 2428330548467231
Subject: Software engineering
Abstract/Summary:
Chinese is one of the most widely used languages in the world, and linguists have never stopped studying it. Research has progressed from the processing of characters and words to the processing of sentences and texts. This indicates that word-level study has become increasingly mature, and that the focus of future research is gradually shifting toward sentences and discourse. As a key link between the single sentence and the text, the Chinese complex sentence has become an important research direction in Chinese information processing, and it is also one of the field's difficulties.

This thesis applies the decision tree algorithm from machine learning to the problem. Starting from the relevance features between clauses, it uses these attributes as the basis for deciding whether two clauses belong to the same category. The problem of analyzing the internal structure of a complex sentence is thereby converted into a problem of classifying the clauses inside it, which in turn determines the hierarchical relationships within the complex sentence.

The object of analysis is the complex sentence in the non-filled state. Because the analysis works mainly with the clauses of a complex sentence, preprocessing is very important. The preprocessing described in this thesis has three steps. First, the complex sentence is segmented to find the correct relation words, which are an important basis for classifying complex sentences. Second, the clauses are processed: punctuation is used to divide the complex sentence into preliminary clauses, and an existing rule base is then used to filter them, removing false clauses and keeping the correct ones. Third, the relation words are used to make a preliminary division of the complex sentence into levels; each clause whose level cannot be determined is combined with the clauses before and after it to form clause pairs, and the relevance features between the clauses in each pair are extracted.

After preprocessing, the collected clauses are used as the original training set, and a decision tree algorithm is used to build the corresponding decision tree model. The clauses in a complex sentence can then be classified, which improves the accuracy of dividing the hierarchical structure of complex sentences. The experimental results show that, by building a decision tree model with this algorithm, the hierarchy of complex sentences can be classified automatically, with an accuracy of 83.7%. This also shows that the method is effective for the hierarchical division of standard complex sentences in the non-filled state.
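
The pipeline described in the abstract (punctuation-based clause splitting, clause pairs, relevance features, decision tree classification) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the relation-word list, the three boolean features, the "same"/"different" labels, the four training rows, and the use of a plain ID3-style tree. The rule-base filtering of false clauses is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical relation words; the thesis relies on its own resources.
RELATION_WORDS = ("因为", "所以", "虽然", "但是", "如果")

def split_clauses(sentence):
    """Split a complex sentence into preliminary clauses at punctuation."""
    parts = re.split(r"[，,；;。.!?！？]", sentence)
    return [p.strip() for p in parts if p.strip()]

def adjacent_pairs(clauses):
    """Pair each clause with the clause that follows it."""
    return list(zip(clauses, clauses[1:]))

def pair_features(left, right):
    """Toy relevance features for a clause pair (illustrative assumptions)."""
    return {
        "left_has_rel": any(w in left for w in RELATION_WORDS),
        "right_has_rel": any(w in right for w in RELATION_WORDS),
        "similar_length": abs(len(left) - len(right)) <= 2,
    }

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def build_tree(rows, labels, features):
    """ID3 over boolean features: returns a label or (feature, branches)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority

    def gain(feature):
        remainder = 0.0
        for value in (True, False):
            subset = [l for r, l in zip(rows, labels) if r[feature] == value]
            if subset:
                remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(features, key=gain)
    rest = [f for f in features if f != best]
    branches = {}
    for value in (True, False):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        if idx:
            branches[value] = build_tree([rows[i] for i in idx],
                                         [labels[i] for i in idx], rest)
        else:
            branches[value] = majority  # unseen value: fall back to majority
    return (best, branches)

def predict(tree, row):
    """Walk the tree until a leaf label (a string) is reached."""
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree

# Made-up training rows: "same" means the two clauses are grouped together.
FEATURES = ["left_has_rel", "right_has_rel", "similar_length"]
TRAIN_ROWS = [
    {"left_has_rel": True, "right_has_rel": True, "similar_length": True},
    {"left_has_rel": True, "right_has_rel": True, "similar_length": False},
    {"left_has_rel": False, "right_has_rel": False, "similar_length": True},
    {"left_has_rel": True, "right_has_rel": False, "similar_length": False},
]
TRAIN_LABELS = ["same", "same", "different", "different"]
TREE = build_tree(TRAIN_ROWS, TRAIN_LABELS, FEATURES)

clauses = split_clauses("因为下雨，所以我没去，他也没去。")
predictions = [predict(TREE, pair_features(a, b))
               for a, b in adjacent_pairs(clauses)]
```

In this toy setup each adjacent clause pair receives one label, and a run of "same" labels would mark clauses that sit at the same level of the hierarchy. A realistic system would need labeled corpus data and far richer features than the three shown.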
Keywords/Search Tags:Relationship Words, Related features, Decision tree, Non-filled state, Complex sentence hierarchy