With the rapid development of information technology, many mission-critical applications in healthcare, education, and finance store their information in relational databases. Software engineers frequently use SQL statements to insert, delete, update, and select data during software development, and business users likewise use SQL for reporting and online analytical processing (OLAP). Although expressive and powerful, SQL is essentially a programming language: it requires users to be skilled in programming, proficient in SQL syntax, and familiar with the database schema. How can the learning cost of SQL be reduced? How can SQL queries be generated faster and better? And how can SQL query statements be generated in a more natural way? These have become urgent problems, and this paper attempts to tackle them. This paper studies the automatic generation of end-user-oriented SQL query statements and puts forward techniques and methods for generating SQL query statements from an interactive natural language interface (INL2SQL) and from natural language (NL2SQL). The main contributions and innovations of this paper are as follows:

1) This paper proposes an INL2SQL (Interactive Natural Language to SQL Query Statement) method based on mapping. It uses modules for dependency parse tree generation, parse tree node mapping, parse tree optimization and reconstruction, and query tree translation to analyze the intention of users' queries and map them to SQL query statements. Meanwhile, through an interactive dialog and user interface module, the intention can be reconstructed and completed. Experiments on the Classicmodels and MAS datasets show that the accuracy of the interactive model is 100%, 80%, and 35% and 100%, 93%, and 71% in the simple, medium, and hard scenarios. The method effectively addresses the problems of missing and ambiguous intent.

2) This paper proposes an NL2SQL (Natural Language to SQL Query Statement) method using deep reinforcement learning. A new neural network model composed of an encoder and a decoder and combined with a self-attention
mechanism is designed and adopted in this method. The method also uses reinforcement learning to improve the model with the execution results of SQL statements. In addition, the learning problem of the network model is transformed into a policy optimization problem, and the states and actions are defined. To solve the ordering problem of filter conditions and the implicit column problem, the method further proposes a solution with non-deterministic oracle prediction and an ANYCOL state. The experimental results show that the proposed method achieves leading performance on the WikiSQL dataset. Its database execution accuracy on the ATIS validation dataset is 89.2%, and its logical form and database execution accuracy on the validation and test datasets of Spider are 23.2% and 24.1% respectively, which outperforms existing approaches.

3) This paper proposes an NL2SQL method based on multitask learning. To further improve the accuracy of NL2SQL generation and to solve the problem of generating SQL queries from Chinese natural language, multitask learning is adopted. The method first unifies different tasks with a TCR (Task-Content-Result) template, and then uses a multitask network model, composed of an encoder and a decoder, together with a dual cooperative attention mechanism to learn the tasks simultaneously. Moreover, different optimization strategies, such as fully joint and anti-curriculum training, are adopted. The logical form and database execution accuracy of this method on the WikiSQL dataset are 78.7% and 86.1% respectively, which are the highest levels reported and verify its effectiveness. When more tasks are learned, the total score is 607.7, which shows that the method can generate SQL query statements from Chinese natural language and has good generality and extensibility.
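
The execution-based training signal described in contribution 2) can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the function names (`execute`, `execution_reward`, `make_demo_db`), the `players` table, and the reward values are assumptions chosen for illustration. The sketch runs a predicted query and a reference query against a database and compares their results, so two queries whose filter conditions appear in a different order receive the same reward.

```python
import sqlite3


def execute(conn, sql):
    """Run a query; return its rows in a canonical order, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort by repr so the comparison ignores row order and tolerates NULLs.
    return sorted(rows, key=repr)


def execution_reward(conn, predicted_sql, gold_sql):
    """Illustrative reward: the values below are assumptions, not the paper's."""
    predicted = execute(conn, predicted_sql)
    if predicted is None:
        return -2.0  # the predicted query does not execute
    if predicted == execute(conn, gold_sql):
        return 1.0   # same result set as the reference query
    return -1.0      # executes, but returns a different result


def make_demo_db():
    """Build a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
    conn.executemany(
        "INSERT INTO players VALUES (?, ?, ?)",
        [("Ann", "A", 12), ("Bob", "A", 7), ("Cy", "B", 20)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    gold = "SELECT name FROM players WHERE team = 'A' AND points > 10"
    swapped = "SELECT name FROM players WHERE points > 10 AND team = 'A'"
    print(execution_reward(conn, swapped, gold))
    print(execution_reward(conn, "SELECT name FROM players WHERE team = 'B'", gold))
    print(execution_reward(conn, "SELEC name FROM players", gold))
```

Because the reward depends only on the returned rows, a comparison of this kind does not penalize a query for listing its filter conditions in a different order from the reference, which is the kind of ordering issue that contribution 2) addresses.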