Hierarchy Division Of A Compound Sentences With Non-saturated Relation Word Via Neural Network

Posted on:2020-07-20

Degree:Master

Type:Thesis

Country:China

Candidate:L L Yang

Full Text:PDF

GTID:2518305762978949

Subject:Computer software and theory

Abstract/Summary:

The compound sentence is an important unit of Chinese grammar and contains two or more clauses. The hierarchical structure and logical-semantic relations among these clauses are relatively complex. Correctly classifying the hierarchical relations of Chinese compound sentences is of great significance to automatic question answering and machine translation, and is also conducive to the development of text comprehension. A relation mark is a word in a compound sentence that connects clauses and indicates the relation between them. When some relation marks are omitted, the hierarchical structure and logical-semantic relations of the compound sentence cannot be identified explicitly, which makes it difficult to divide the hierarchy of non-saturated compound sentences with relation words. This thesis studies non-saturated compound sentences with three clauses and uses deep learning to automatically identify their hierarchical attribution. The work done is as follows.

First, punctuation and dependency syntax are used to make a preliminary division of a compound sentence into clauses. A purpose-built "independent language" rule base is then used to filter out pseudo clauses, so that the clauses are divided accurately.

Secondly, clause features are extracted in three aspects. For syntax, a syntactic analysis tree is constructed for each clause and traversed with the depth-first search algorithm to extract the syntactic components of the clause, from which the syntactic similarity among clauses is calculated. For semantics, the core argument of each clause is extracted and its word vector is looked up in a trained word vector model, from which the semantic similarity among clauses is calculated. For subjects, a subject extractor is designed to extract the subject of each clause; the subjects are compared to judge whether they are the same, and the subject similarity among clauses is calculated.

Thirdly, the hierarchical division model for non-saturated compound sentences with relation words is constructed and trained on the extracted feature data. Analysis of the feature data set shows that the division of the hierarchical relationship of compound sentences is closely related to the semantic similarity among clauses. The weight of semantic similarity is therefore increased during training, so as to further improve the accuracy of the model.

Finally, the proposed method is verified as follows. 10,000 compound sentences are selected from the CCCS corpus to test the hierarchical classification model, which reaches an accuracy of 74%. In addition, random forest, support vector machine, and neural network models are compared on the same training set and test set in terms of accuracy, recall, precision, ROC curve, and AUC. The neural-network-based hierarchical classification model gives better results, which demonstrates the effectiveness of the method.
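The three clause-pair features described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the `Node` tree type, the use of Jaccard overlap of tree labels as the syntactic similarity, cosine similarity between core-argument word vectors as the semantic similarity, and a 0/1 subject match as the subject similarity. It is not the thesis's implementation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of a hypothetical clause parse tree (illustrative type)."""
    label: str
    children: list = field(default_factory=list)  # list of Node


def dfs_labels(root):
    """Collect node labels in depth-first (pre-order) order."""
    labels, stack = [], [root]
    while stack:
        node = stack.pop()
        labels.append(node.label)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return labels


def syntactic_similarity(tree_a, tree_b):
    """Jaccard overlap of the two label sets (an assumed measure)."""
    set_a, set_b = set(dfs_labels(tree_a)), set(dfs_labels(tree_b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def subject_similarity(subj_a, subj_b):
    """1.0 if both clauses have the same extracted subject, else 0.0."""
    return 1.0 if subj_a is not None and subj_a == subj_b else 0.0


def pair_features(tree_a, tree_b, vec_a, vec_b, subj_a, subj_b):
    """Feature vector [syntactic, semantic, subject] for one clause pair."""
    return [
        syntactic_similarity(tree_a, tree_b),
        cosine_similarity(vec_a, vec_b),
        subject_similarity(subj_a, subj_b),
    ]
```

For a three-clause sentence, one plausible use is to compute `pair_features` for adjacent clause pairs and pass the concatenated values to a classifier that decides which pair groups together at the lower level of the hierarchy. How the thesis actually combines and weights the features is not stated beyond the increased weight on semantic similarity.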

Keywords/Search Tags:

Non-saturated compound sentences with relation words, Hierarchical recognition, Syntactic features, Semantic features, Neural network

Related items

1	Relation Recognition Of Non-saturated Chinese Compound Sentences With Two Clauses Based On Deep Learning
2	Analysis And Research On The Hierarchical Structure Of Compound Sentences Based On The Concealing Or Revealing Rules Of Relation Marker And Associated Features
3	Automatic Recognition Of Relation Category Of Non-saturated Compound Sentences With Two Clauses
4	Hybrid Sentence Similarity Research Based On Semantic
5	Research On The Methods Of Relation Words Automatic Identification In Chinese Compound Sentences Based On Collocation Strength
6	Analysis Of Hierarchical Structure In The Marked Compound Sentences Based On Collocation Of Relation Words
7	Research On The Recognition Model Of Network Buzzwords Based On Compound Features
8	Automatic Selecting The Syntactic Features Of The Compound Sentence Based On The Dependency Tree
9	Research Of Rule Parser In Relation Words Of Compound Sentences Automatic Identification System
10	Research On 3D Human Pose Recognition Algorithm Based On Semantic Features