Research On The CCG-based Algorithm Improvement And Combination Models For Semantic Parsing

Posted on: 2016-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Zhu
GTID: 2308330464464471
Subject: Computer application technology
Abstract/Summary:
Semantic parsing, an important research problem in natural language processing (NLP), is the task of mapping a natural language (NL) sentence into a complete, formal meaning representation that a computer can interpret and execute. The technology has broad application prospects in robot interaction, natural language interfaces to databases, and other fields.

Combinatory Categorial Grammar (CCG) provides a convenient structure for relating the syntactic and semantic representations of natural language, so among the many approaches to semantic parsing, CCG-based methods have received broad attention. However, because the existing decoding algorithms for CCG semantic parsing use an inexact search strategy, convergence of the parameter training algorithm is not guaranteed in theory. In addition, combination models for syntactic analysis have attracted growing attention in recent years and have achieved good performance, but no comparable work has yet been reported for semantic parsing.

This thesis therefore studies improvements to CCG semantic parsing methods and combination models for semantic parsing. The research involves two major tasks.

First, improvement of the learning algorithm for CCG semantic parsing. We use the max-violation perceptron to improve the UBL algorithm (a CCG semantic parsing algorithm based on higher-order unification). The max-violation perceptron addresses the efficiency and convergence problems of structured perceptron parameter learning under inexact decoding, and its convergence has been proved rigorously.
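As a rough illustration of the max-violation idea mentioned above, the Python sketch below shows the generic update for a step-by-step search: among all steps where the search's best partial result scores at least as high as the best correct partial result, it picks the step with the largest score gap and updates the weights there. This is the general form of the technique, not the subtree-pair variant developed in this thesis. The function names, the sparse feature dictionaries, and the per-step prefix lists are all assumptions made for the example.

```python
from collections import Counter


def score(weights, feats):
    """Dot product of a sparse weight dict and a sparse feature Counter."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def max_violation_update(weights, gold_prefixes, pred_prefixes, lr=1.0):
    """Apply one max-violation perceptron update to `weights` in place.

    gold_prefixes[i] and pred_prefixes[i] are hypothetical sparse feature
    Counters for the best correct and the best predicted partial
    derivation at search step i. Returns the step used for the update,
    or None when no step violates the model.
    """
    best_step, best_violation = None, None
    for i, (gold, pred) in enumerate(zip(gold_prefixes, pred_prefixes)):
        if gold == pred:
            continue  # the search still agrees with the correct prefix
        # A violation: the wrong prefix scores at least as high as the correct one.
        violation = score(weights, pred) - score(weights, gold)
        if violation >= 0 and (best_violation is None or violation > best_violation):
            best_step, best_violation = i, violation
    if best_step is None:
        return None
    gold, pred = gold_prefixes[best_step], pred_prefixes[best_step]
    # Standard perceptron step: move toward the correct features, away from the wrong ones.
    for f in set(gold) | set(pred):
        weights[f] = weights.get(f, 0.0) + lr * (gold.get(f, 0) - pred.get(f, 0))
    return best_step
```

Updating at the step with the largest violation, rather than at the end of the search, keeps each update valid even when the correct derivation has fallen out of the beam.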
For parameter updating, we define a max-violation pair of corresponding subtrees between the set of correct parses and the set of incorrect parses, and update the parameter vector according to this pair. To obtain the latent variables, we propose forced decoding based on a meaning representation (MR) subexpression matching criterion, which efficiently recovers the correct parse trees of the training instances. Experimental results on the Geo880 data set show that the proposed method improves both the accuracy and the time efficiency of the baseline system.

Second, research on combination models for semantic parsing. Drawing on combination methods for syntactic analysis and taking the characteristics of semantic parsing into account, we design two combination methods for semantic parsers: a similarity-based method and a naive Bayes method. Several base semantic parsers are trained first, and their output MRs are then either combined or one of them is selected. The experimental results indicate that both combination systems perform significantly better than the best individual parser, and that the naive Bayes method outperforms the similarity-based method.
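To make the selection idea concrete, here is a minimal Python sketch of choosing one output among several parsers by mutual similarity. The abstract does not state which similarity measure the thesis uses, so token-level Jaccard overlap is an assumption here, as are the function names and the example MR strings.

```python
import re


def tokenize(mr):
    """Split an MR string into a set of predicate and constant tokens."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", mr))


def jaccard(a, b):
    """Token-set Jaccard similarity between two MR strings (assumed measure)."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def select_by_similarity(outputs):
    """Return the parser output most similar, in total, to all the others."""
    if not outputs:
        raise ValueError("need at least one parser output")

    def total_similarity(i):
        return sum(
            jaccard(outputs[i], outputs[j])
            for j in range(len(outputs))
            if j != i
        )

    # max() keeps the first index on ties, so earlier parsers win ties.
    best = max(range(len(outputs)), key=total_similarity)
    return outputs[best]


# Hypothetical outputs of three base parsers for the same question.
candidates = [
    "answer(capital(stateid(texas)))",
    "answer(capital(stateid(texas)))",
    "answer(population(stateid(texas)))",
]
chosen = select_by_similarity(candidates)
```

In this toy case two parsers agree, so their shared output has the highest total similarity and is selected. A naive Bayes combiner would instead weight each parser's vote by how reliable it was on held-out data.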
Keywords/Search Tags: Semantic Parsing, Combinatory Categorial Grammar (CCG), Max-violation Perceptron, Syntactic Analysis, Combination Models