
Research And Application On Method Of Generating SQL Through Natural Language Based On Deep Learning

Posted on: 2022-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ge
GTID: 2518306329990699
Subject: Software engineering
Abstract/Summary:
With the continuous development of computer technology, all aspects of human society have become closely connected with it. Most of the massive data generated in daily production and life is stored electronically in relational databases. Accessing this data usually requires writing SQL (Structured Query Language) to operate on the database. However, SQL is essentially a programming language, and writing it requires a certain level of expertise as well as an understanding of the schema of the database being accessed. Querying a database in natural language not only saves users the time needed to learn SQL and understand the database schema, but also improves query efficiency. Generating SQL from natural language therefore has significant research and application value, and it has gradually become one of the hot topics in natural language processing.

In recent years, with the success of deep learning, more and more researchers have applied deep learning techniques to NL2SQL (Natural Language to Structured Query Language) tasks. Although these techniques have achieved good results, some problems remain. For example, converting natural language to SQL often requires identifying the database tables and columns mentioned in the question, which in turn requires a thorough representation of the database schema (the tables and columns in the database and the relationships between them) when encoding it. Moreover, when generating SQL from Chinese natural language, the description in the query is often inconsistent with the data stored in the database.

To address these issues, this paper builds on the IRNet model and makes the following contributions, all aimed at improving the accuracy of SQL generation by deep learning models:

(1) A gated graph neural network (GGNN) is added to the IRNet model to encode the database schema. It incorporates global information from the schema into the word embedding of each table name and column name, so that the model perceives more of the contextual relationships in the database structure when generating table names and column names in SQL.

(2) The association between database content and columns is introduced into the model by computing attention scores between the natural language query and the database content. This matches database content with the corresponding parts of the query and helps the model predict the column names in SQL more accurately.

(3) A pre-trained language model is added to IRNet to handle SQL generation from Chinese natural language. A cross-lingual pre-trained model encodes both the database schema and the natural language, which solves, to some extent, the problem of mapping between Chinese natural language and English database schemas.

(4) A natural language database query system is developed on top of the enhanced deep learning model. The user simply selects the database to be queried and enters a natural language question; the system automatically converts the question into a SQL statement, executes it on the selected database, and returns the result to the user.
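The idea in contribution (2), scoring how strongly a question attends to the content of each column, can be illustrated with a minimal sketch. This is not the thesis's implementation: character-bigram count vectors stand in for the learned encodings, and the function names, the `temperature` parameter, and the sample table are all hypothetical.

```python
import math
from collections import Counter


def bigram_vector(text):
    """Toy embedding: character-bigram counts (a stand-in for learned encodings)."""
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def column_attention(question, table_content, temperature=0.1):
    """Return one attention weight per column.

    Each column's raw score is the best similarity between any question
    token and any cell value stored in that column (hypothetical scoring
    rule); the scores are then normalized with a softmax.
    """
    q_tokens = question.split()
    columns = list(table_content)
    raw = []
    for col in columns:
        best = max(
            cosine(bigram_vector(tok), bigram_vector(str(cell)))
            for tok in q_tokens
            for cell in table_content[col]
        )
        raw.append(best / temperature)
    return dict(zip(columns, softmax(raw)))


# Hypothetical table content: column name -> sample cell values.
content = {
    "city": ["Beijing", "Shanghai"],
    "name": ["Li Wei", "Zhang Min"],
}
weights = column_attention("Which students live in Shanghai", content)
```

In this toy example the question mentions a value stored in the `city` column, so that column receives most of the attention weight even though the word "city" never appears in the question. A real model would compute these scores from learned representations and feed them into column prediction.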
Keywords/Search Tags: Deep Learning, Natural Language Processing, Natural Language to SQL