Research And Application On Method Of Generating SQL Through Natural Language Based On Interactive Information Editing

Posted on: 2024-05-23 | Degree: Master | Type: Thesis
Country: China | Candidate: J A Zhang
GTID: 2568307064497244 | Subject: Engineering
Abstract/Summary:
Relational databases, a mainstream data storage technology, hold a large amount of important data from all walks of life. Accessing that data requires some specialized knowledge, so the demand from ordinary users to query relational databases through natural language is increasing year by year. As the technology has developed, users' needs have shifted from querying the database once with a single natural language question to querying it multiple times, in progressively greater depth, through a sequence of natural language sentences. To meet these requirements, Natural Language-to-SQL technology, also known as Text-to-SQL, has gradually become a hot topic in the field of natural language processing (NLP).

With the continuous development of deep learning and pre-trained models, the question match accuracy on the task of generating one SQL query from a single natural language question has already reached 80% on the public datasets of the field. However, the interaction match accuracy on the task of generating multiple SQL queries from a set of natural language questions is still too low for practical use. From a study of existing work in the Text-to-SQL field, we find that improving the model's ability to parse the interaction history is the research focus of the field. Current research concentrates on improving the model's ability to process context information related to the natural language questions and the schema; research on processing SQL history information is scarcer.

To improve the model's ability to process SQL context information, we propose the following improvements based on the R²SQL model:

(1) To address the problem that the model generates more and more repeated SQL tokens over the course of multi-turn generation, we use editing information as the intermediate output of the R²SQL model. The SQL is then obtained by applying this editing information, according to corresponding rules, to the SQL generated in the previous turn. This improves both the question match accuracy and the interaction match accuracy of our model.

(2) To address the problem that existing models extract features of the previous turn's SQL poorly, we use the multimodal data in the public dataset to fine-tune a pre-trained model, which serves as the SQL encoder in the R²SQL model and encodes the previous turn's SQL.

(3) We study the relations between the utterances in the dataset and the editing information. We then embed these relations with GloVe and add them to the features of the utterance tokens as knowledge labels, which improves the model's ability to learn the features of the result. This slightly improves the question match accuracy and interaction match accuracy of our model.

Based on the above work, we built a system for querying a database interactively through natural language. After choosing a database in the system, users can enter a natural language question to query it. After the first query, the user can read the feedback, the result, and the interaction history, and then choose either to continue querying the database in more depth or to start a new round of queries. In addition, to meet users' interactive needs when using the system, we implemented a template-based method for generating a natural language description from SQL. After the system generates SQL from the user's natural language question, the SQL is converted into the corresponding natural language description and displayed to the user, who can then confirm whether the generated SQL meets their needs.
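
The idea in improvement (1), producing editing information and combining it with the previous turn's SQL, can be illustrated with a minimal sketch. The abstract does not specify the edit vocabulary or the combination rules, so the `KEEP`/`DELETE`/`INSERT` operations, the token-level granularity, and the name `apply_edits` below are all assumptions made for illustration, not the thesis's actual method.

```python
from typing import List, Tuple

# Hypothetical edit vocabulary (an assumption; the abstract does not define one).
# Each edit is (operation, token); the token is only used by INSERT.
Edit = Tuple[str, str]


def apply_edits(prev_sql: List[str], edits: List[Edit]) -> List[str]:
    """Rebuild the current-turn SQL tokens from the previous-turn tokens and edits."""
    out: List[str] = []
    pos = 0  # index of the next unread token in prev_sql
    for op, token in edits:
        if op == "KEEP":
            if pos >= len(prev_sql):
                raise ValueError("KEEP past the end of the previous SQL")
            out.append(prev_sql[pos])  # copy one token from the previous turn
            pos += 1
        elif op == "DELETE":
            if pos >= len(prev_sql):
                raise ValueError("DELETE past the end of the previous SQL")
            pos += 1  # skip one token from the previous turn
        elif op == "INSERT":
            out.append(token)  # emit a new token; previous SQL is not consumed
        else:
            raise ValueError(f"unknown edit operation: {op}")
    # Tokens not covered by the edit sequence are copied unchanged.
    out.extend(prev_sql[pos:])
    return out


# Turn 1 produced this SQL; turn 2 adds a filter to it.
prev_tokens = "SELECT name FROM singer".split()
turn2_edits = [("KEEP", "")] * 4 + [
    ("INSERT", "WHERE"), ("INSERT", "age"), ("INSERT", ">"), ("INSERT", "30"),
]
print(" ".join(apply_edits(prev_tokens, turn2_edits)))
# prints: SELECT name FROM singer WHERE age > 30
```

In this sketch, a follow-up question that only adds a condition needs four new tokens plus copy operations, rather than a full regeneration of the query, which is the kind of repeated-token generation that improvement (1) aims to avoid.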
Keywords/Search Tags: natural language processing, deep learning, text-to-SQL, pre-trained model