Design And Implementation Of Chinese Syntactic Analysis System Based On Phrase Structure

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Qiu
GTID: 2518306107462524
Subject: Software engineering
Abstract/Summary:
As Natural Language Processing (NLP) has become increasingly commercialized, many of its key technologies have attracted the attention of universities and research institutions over the past decade. As one of the cornerstones of NLP, syntactic analysis characterizes the internal structure of a sentence and provides a basis for downstream tasks. Depending on the underlying formal grammar, syntactic analysis can be roughly divided into phrase structure syntactic analysis and dependency syntactic analysis. As the core module of many applications, phrase structure syntactic analysis is of undoubted importance. Linguists opened the field in the middle of the last century with rule-based methods. With the development and wide application of machine learning and deep learning, the efficiency and depth of syntactic analysis have steadily improved. This thesis studies and designs a syntactic analysis system based on phrase structure.

The thesis first reviews previous research and the basic theory of syntactic analysis. It then introduces the models and corresponding algorithms related to the system implementation, including but not limited to the Context-Free Grammar (CFG) model, the Probabilistic Context-Free Grammar (PCFG) model, and the Compositional Vector Grammar (CVG) model, together with the syntax tree as the data-level representation of the models. The requirements are then analyzed according to the actual functions of the system and its overall operating process. The system is divided into several modules: a data preprocessing module, a rule extraction and probability calculation module, a CKY (Cocke-Younger-Kasami) algorithm module, and the deep-learning-related word vector and SU-RNN (Syntactically Untied Recursive Neural Network) modules. These modules are clearly layered and internally cohesive, and together they carry out the overall function of the analysis system.

The thesis goes on to describe the details of the implementation, as well as the content and results of testing and evaluation. Across three progressively deeper models, the analysis quality of the system improves step by step. The test set and the validation set were used to obtain a comprehensive evaluation on quantitative metrics such as precision, recall, and F1 score, and the results were roughly in line with the expectations set at the beginning of the design. Specifically, for syntactic analysis of Chinese phrase structure, the basic PCFG model reached an F1 score of 72.7%, the lexicalized PCFG model reached 74.7%, and the CVG model reached 77.1%, with an input of 1000 sentences of 30 to 50 words each.
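The abstract names a rule extraction and probability calculation module and a CKY module but gives no code. The following is a minimal sketch of how those two steps could fit together, not the thesis's actual implementation. It assumes a grammar in Chomsky normal form and relative-frequency (maximum likelihood) estimates. The toy treebank, the pinyin tokens, and the function names `count_rules`, `estimate_pcfg`, and `cky_parse` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy treebank (NOT from the thesis), already in Chomsky normal
# form. A node is either (tag, word) or (label, left_child, right_child).
# Tokens are pinyin stand-ins for segmented Chinese words.
TREEBANK = [
    ("S", ("NP", "wo"), ("VP", ("V", "xihuan"), ("NP", "mao"))),
    ("S", ("NP", "mao"), ("VP", ("V", "xihuan"), ("NP", "yu"))),
]


def count_rules(tree, counts):
    """Rule extraction: count every rule used in one tree."""
    label = tree[0]
    if len(tree) == 2:  # lexical rule: tag -> word
        counts[label][(tree[1],)] += 1
    else:  # binary rule: A -> B C
        left, right = tree[1], tree[2]
        counts[label][(left[0], right[0])] += 1
        count_rules(left, counts)
        count_rules(right, counts)


def estimate_pcfg(treebank):
    """Probability calculation: P(A -> rhs) = count(A -> rhs) / count(A)."""
    counts = defaultdict(lambda: defaultdict(int))
    for tree in treebank:
        count_rules(tree, counts)
    probs = {}
    for lhs, rhs_counts in counts.items():
        total = sum(rhs_counts.values())
        for rhs, c in rhs_counts.items():
            probs[(lhs, rhs)] = c / total
    return probs


def cky_parse(words, probs, root="S"):
    """Probabilistic CKY: return (best tree, probability) or (None, 0.0)."""
    n = len(words)
    # best[(i, j)][label] = (probability, backpointer) for span words[i:j]
    best = defaultdict(dict)
    lexical = [(l, r[0], p) for (l, r), p in probs.items() if len(r) == 1]
    binary = [(l, r[0], r[1], p) for (l, r), p in probs.items() if len(r) == 2]

    # Width-1 spans: apply lexical rules.
    for i, word in enumerate(words):
        for lhs, rhs_word, p in lexical:
            if rhs_word == word:
                best[(i, i + 1)][lhs] = (p, word)

    # Wider spans: try every split point k and every binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, b, c, p in binary:
                    if b in best[(i, k)] and c in best[(k, j)]:
                        score = p * best[(i, k)][b][0] * best[(k, j)][c][0]
                        if score > best[(i, j)].get(lhs, (0.0, None))[0]:
                            best[(i, j)][lhs] = (score, (k, b, c))

    if root not in best[(0, n)]:
        return None, 0.0

    def build(i, j, label):
        """Follow backpointers to rebuild the highest-scoring tree."""
        _, back = best[(i, j)][label]
        if isinstance(back, str):
            return (label, back)
        k, b, c = back
        return (label, build(i, k, b), build(k, j, c))

    return build(0, n, root), best[(0, n)][root][0]


if __name__ == "__main__":
    grammar = estimate_pcfg(TREEBANK)
    tree, prob = cky_parse(["wo", "xihuan", "yu"], grammar)
    print(tree, prob)
```

The sketch handles only binary and lexical rules, so a real treebank would first need binarization and unary-rule handling, plus smoothing for unseen words. Working in log probabilities would also avoid underflow on sentences of 30 to 50 words.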
Keywords/Search Tags: syntactic analysis, CFG, PCFG, SU-RNN