Font Size: a A A

The Application Research Of Hierarchical Structure Analysis Of Chinese Complex Sentences Based On Deep Learning

Posted on:2020-03-11Degree:MasterType:Thesis
Country:ChinaCandidate:M LiFull Text:PDF
GTID:2428330578952893Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Chinese information processing,an important branch of natural language processing,plays a prominent role in many aspects such as semantic understanding and text generation.As an important entity unit of Chinese grammar,complex sentences are complex and diverse in semantic expression.Therefore,it has high research value and significance.On the one hand,it is composed of single sentences,which can express abundant semantic information;on the other hand,it also contains the logical semantic hierarchical relationship between single sentences,which is of great significance for paragraph analysis.At present,the study of complex sentences mainly includes clause division,relational word recognition,relational category judgment,and hierarchical structure analysis.In view of the direct influence of relational words on the recognition of the hierarchical structure of complex sentences,it is very important to recognize relational words effectively and accurately.However,due to the different segregation degree of relational words themselves and the existence of a large number of unmarked or unmarked complex sentences,the extraction accuracy does not exceed 76.3%[1].Therefore,it is necessary to comprehensively analyze the hierarchical structure of complex sentences from the perspectives of syntax,semantics,and cross-cutting features.This paper mainly focuses on the analysis and recognition of the hierarchical structure of complex sentences based on the improved convolutional neural network method and the strategy of multi-dimensional feature fusion.The research work mainly includes three parts.Firstly,the accurate clause division of complex sentences is carried out.By analyzing the dependency syntax of complex sentences,the grammatical features based on the core of predicates are extracted,including the distribution of dependency between sentences and the distribution of dependency between sentences.Then,considering the degree of semantic association between clauses is an important factor in determining the hierarchical structure division,Chinese Wikipedia corpus training is used.The sentence vector Doc2Vec model extracts the semantic features based on sentence vectors and the similarity measurement features between sentences,and reduces the dimension of sentences by PCA according to 90%retention of information.Finally,because the complex sentences are mostly short text and lack of context information,to a certain extent,the overall represent-ation of clauses will be semantically missing or deviated,so local semantics information can be added as a supplement.Therefore,abstract features based on word vector Word2Vec and TextRank are extracted.The former represents word embedding,the latter extracts weighted keywords in sentences,and calculates the weighted keyword vectors of clauses to represent their local semantic information.Dimension reduction is also done according to 90%information retention.Therefore,through the shallow syntactic features,deep semantic features,and cross-abstract features as three dimensions of complex sentences.The machine learning algorithm is used as the baseline model and the feature weight analysis is performed by feature fusion.The improved two-channel convolutional neural network model with attention mechanism is compared on the CCCS corpus to analyze the prediction accuracy,F1-score,AUC.Through experiments,semantic features,similarity features and vector features with weights have significant influence on the target,and the correct rate of complex sentence hierarchy analysis is 83.1%.So far,the series of research methods based on deep learning proposed in this paper provide a better way for hierarchical structure analysis.
Keywords/Search Tags:Attention mechanism, TextRank, Deep semantics, Hierarchical structure of complex sentences, Convolutional neural network
PDF Full Text Request
Related items