Research On Models Of Generating SQL Statement Through Natural Language Based On Knowledge Graph

Posted on: 2022-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Sun
GTID: 2518306341451914
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of big data, the Internet of Things, and other Internet technologies, a large amount of personal and enterprise production and operation data is generated every day, most of which is stored in relational databases. Direct access to relational database data generally requires queries written in SQL (Structured Query Language), a database query language, so the query operation has to be completed by technical personnel who understand that language. NL2SQL (Natural Language to SQL) uses natural language processing technology to transform a natural language query into the corresponding SQL statement, helping non-expert users query information from the database. Based on the TableQA Chinese dataset, this paper explores how to improve the performance of NL2SQL models. The specific innovations are as follows:

(1) Proposed an NL2SQL multi-task learning model architecture based on subtask optimization. This paper designs a three-layer NL2SQL model architecture based on a multi-task learning framework, which includes: 1) a semantic feature encoder layer for extracting semantic information, in which a category information identifier is designed; 2) a column representation enhanced encoder layer for strengthening the expressive ability of data columns, in which a column representation enhanced encoder model structure is designed; 3) a subtask model layer for predicting the substructures of the SQL statement, in which optimization algorithms for auxiliary subtask design and subtask fusion are proposed so that the model learns the correlation between subtasks more fully and the prediction ability of the subtask models improves. Experimental results show that the accuracy of the model on the test set is improved by 7.8%.

(2) Proposed an NL2SQL condition value subtask prediction model based on a knowledge graph. The condition value subtask prediction model is designed around the idea of matching against a candidate set of condition values. Compared with current industry models, it solves the problem of predicting multiple condition values. In addition, this paper selects synonym pairs by mining open-source knowledge graphs and knowledge bases, and constructs a synonym database on the scale of 350,000 entries, which introduces external knowledge into the model. This solves the problem that a condition value is described differently in the data table and in the natural language query, which is caused by the diversity of natural language expression. Finally, the model achieves 87.74% logical form accuracy and 88.91% execution accuracy on the test set.

(3) Designed an NL2SQL intelligent question-answering prototype system based on economic investigation data. The NL2SQL model is improved by transfer learning on actual open-source capital transaction investigation data from the field of public security economic crime investigation. The model achieves 84.5% logical form accuracy and 87.0% execution accuracy on the economic investigation test set. Based on the transferred NL2SQL model, an NL2SQL intelligent question-answering prototype system is designed. The visual interface of the prototype is implemented with the Tkinter library, and the executable file is packaged with PyInstaller.
Keywords/Search Tags: NL2SQL, natural language processing, deep learning, multi-task learning, knowledge graph
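
The abstract reports two metrics, logical form accuracy and execution accuracy. As commonly defined for NL2SQL benchmarks, the first checks whether the predicted SQL matches the reference SQL, and the second checks whether the two queries return the same result. The following is a minimal sketch of that distinction, not the thesis's evaluation code: the `funds` table, the example queries, and the helper names are hypothetical, and the string normalization is a crude stand-in for the structured comparison a real benchmark would use.

```python
import sqlite3


def normalize(sql: str) -> str:
    """Crude canonical form: lowercase, drop semicolons, collapse whitespace.
    (Assumption: real benchmarks compare parsed SQL components instead.)"""
    return " ".join(sql.lower().replace(";", " ").split())


def logical_form_match(pred: str, gold: str) -> bool:
    """True when the predicted SQL text equals the reference after normalization."""
    return normalize(pred) == normalize(gold)


def execution_match(conn: sqlite3.Connection, pred: str, gold: str) -> bool:
    """True when both queries return the same rows, ignoring row order."""
    try:
        pred_rows = conn.execute(pred).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to run counts as wrong
    gold_rows = conn.execute(gold).fetchall()
    return sorted(pred_rows, key=repr) == sorted(gold_rows, key=repr)


def evaluate(conn, pairs):
    """Return (logical form accuracy, execution accuracy) over (pred, gold) pairs."""
    n = len(pairs)
    lf = sum(logical_form_match(p, g) for p, g in pairs) / n
    ex = sum(execution_match(conn, p, g) for p, g in pairs) / n
    return lf, ex


# Hypothetical toy table standing in for a real dataset table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE funds (name TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO funds VALUES (?, ?)",
    [("A", 10.0), ("B", 25.0), ("C", 40.0)],
)

gold = "SELECT name FROM funds WHERE amount > 20"
pairs = [
    ("SELECT name FROM funds WHERE amount > 20", gold),   # same text, same rows
    ("SELECT name FROM funds WHERE amount >= 25", gold),  # different text, same rows
    ("SELECT name FROM funds WHERE amount < 20", gold),   # different text, different rows
]

lf_acc, ex_acc = evaluate(conn, pairs)
print(f"logical form accuracy: {lf_acc:.2f}, execution accuracy: {ex_acc:.2f}")
```

In this toy setup the second prediction differs textually from the reference but returns the same rows, which is why execution accuracy can be higher than logical form accuracy, as it is in the figures the abstract reports.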