
Research And Implementation Of Translating Natural Language To SQL

Posted on: 2021-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhong
Full Text: PDF
GTID: 2428330632963033
Subject: Computer Science and Technology
Abstract/Summary:
Natural language interfaces to databases have been widely studied because they let non-experts query a database in natural language, which gives them high application value. Recently, as data sets have grown richer, more and more methods based on deep learning have been developed. However, these data sets are all in English, so a cross-lingual method is required to transfer a model to Chinese. This thesis focuses on the SQL generation task defined by the English data set WikiSQL. Careful analysis of the data set shows that some condition columns involving common sense or domain knowledge are never mentioned explicitly and are only implied by the natural language query, which makes those conditions hard to predict. To solve this problem, a method that introduces no external knowledge is proposed: the condition value sequences found in the natural language query are used to assist the condition column prediction task. There are two main improvements. First, the order of the condition value task and the condition column task is switched, so that the predicted condition value sequences can guide the condition column task. Second, the condition value task is treated as a sequence labeling task, which not only improves the accuracy of the condition values but also lets the model obtain all condition value sequences in a single pass. In addition, to apply the model to the field of Chinese quality inspection, this thesis uses machine translation to translate the original English data set into Chinese, manually labels a small quality inspection data set based on the database of a quality inspection company, and uses a cross-lingual pre-trained language model to transfer the model. The final experimental results show that the execution accuracy of the model on the test set improves by 4.9% on the English data set, which demonstrates the effectiveness of the method. With the cross-lingual method, the model achieves an accuracy of 65.7% on the Chinese data set. Finally, a prototype SQL generation tool was developed for the database of the quality inspection company, and its performance meets expectations.
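
The sequence labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not name a tagging scheme, so the B/I/O tags, the function name `decode_bio_spans`, and the example question are assumptions made purely for illustration; in the thesis the tags would come from a trained model, whereas here they are written by hand.

```python
from typing import List


def decode_bio_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect every condition value span from one pass over B/I/O tags.

    Hypothetical helper: the B/I/O scheme is an assumption, not something
    the abstract specifies.
    """
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new value starts; close any span that was still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the span that is currently open.
            current.append(token)
        else:
            # "O" (or a stray "I" with no open span) ends the current span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Hand-written tags standing in for model predictions (illustrative only).
question = "how many players from new york scored over 20 points".split()
predicted_tags = ["O", "O", "O", "O", "B", "I", "O", "O", "B", "O"]
condition_values = decode_bio_spans(question, predicted_tags)
```

Because all value spans are recovered in one pass, they can be handed to the condition column task afterwards, which is the reordering the abstract describes. A value such as "new york" may hint at a column that the question never names.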
Keywords/Search Tags: natural language interface, WikiSQL, cross-lingual transfer learning, SQL generation