
Research And Improvement Of Chinese NL2SQL Model Based On Single Table

Posted on: 2022-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2518306749471744
Subject: Automation Technology
Abstract/Summary:
In recent years, semantic parsing, one of the key technologies of natural language processing, has attracted increasing attention. NL2SQL, a semantic parsing task, converts natural language descriptions into executable SQL queries by means of a model. Because Chinese text differs from English text, earlier NL2SQL models built on English datasets cannot be applied directly to Chinese. At the same time, existing NL2SQL models generally predict condition values with a sequence-generation model, which yields low accuracy, and research on NL2SQL tasks often neglects the importance of data quality and model generalization. The innovations of this thesis are as follows:

(1) SQLNet is improved to obtain Ch-SQLNet by adding two tasks, sel-num and where-num, while the overall structure of SQLNet remains unchanged.

(2) Ch-SQLNet and Pre-NL2SQL are divided into 8 sub-tasks according to the structure of SQL statements, and the accuracy of the 8 sub-tasks is compared and analyzed experimentally on the Chinese dataset TableQA. The predictions of the 8 sub-tasks are then assembled into complete SQL statements, and the accuracy of query matching is analyzed. Model performance is evaluated on two indicators: execution accuracy and query-match accuracy.

(3) Special data preprocessing and R-Drop regularization are used to improve the accuracy of Ch-SQLNet and Pre-NL2SQL on both evaluation indicators.

The experimental results show that: 1. Ch-SQLNet is more accurate than SQLNet on the 8 sub-tasks and exceeds SQLNet by 19.1% and 17.2% on the two evaluation metrics. 2. Pre-NL2SQL is more accurate than Ch-SQLNet on the 8 sub-tasks and exceeds Ch-SQLNet by 3.6% and 1.7% on the two evaluation metrics. 3. After special data preprocessing and R-Drop regularization, Ch-SQLNet and Pre-NL2SQL improve by 0.1%-0.6% on the two evaluation indicators.
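The abstract does not enumerate the 8 sub-tasks, but single-table NL2SQL models in the SQLNet family typically split prediction into select-column number, select columns, aggregators, where-condition number, where columns, operators, condition values, and the conjunction. The sketch below shows how such per-sub-task predictions could be assembled into an executable SQL statement; all names and the operator/aggregator vocabularies are hypothetical, and this illustrates the composition step rather than the thesis's implementation.

```python
# Minimal sketch of composing a single-table SQL query from sub-task
# predictions. The 8-way split and all names here are assumptions based
# on the common WikiSQL/TableQA-style decomposition, not the thesis's code.

AGG_OPS = ["", "AVG", "MAX", "MIN", "COUNT", "SUM"]
COND_OPS = [">", "<", "=", "!="]
CONJS = ["", "AND", "OR"]

def assemble_sql(table, pred):
    """Compose an executable SQL string from per-sub-task predictions."""
    # SELECT clause: sel-num columns, each with its predicted aggregator.
    sel_parts = []
    for col_idx, agg_idx in zip(pred["sel_cols"][: pred["sel_num"]],
                                pred["aggs"]):
        col = f'"{table["header"][col_idx]}"'
        sel_parts.append(f"{AGG_OPS[agg_idx]}({col})" if agg_idx else col)
    sql = f'SELECT {", ".join(sel_parts)} FROM "{table["name"]}"'

    # WHERE clause: where-num conditions joined by the predicted conjunction.
    conds = [
        f'"{table["header"][c]}" {COND_OPS[op]} \'{val}\''
        for c, op, val in pred["conds"][: pred["where_num"]]
    ]
    if conds:
        sql += " WHERE " + f" {CONJS[pred['conj']]} ".join(conds)
    return sql

# Hypothetical example: "average price of phones made by BrandA or BrandB"
table = {"name": "phones", "header": ["brand", "price"]}
pred = {"sel_num": 1, "sel_cols": [1], "aggs": [1],
        "where_num": 2, "conds": [(0, 2, "BrandA"), (0, 2, "BrandB")],
        "conj": 2}
print(assemble_sql(table, pred))
# SELECT AVG("price") FROM "phones" WHERE "brand" = 'BrandA' OR "brand" = 'BrandB'
```

Because each clause comes from its own classifier rather than a free-form decoder, query-match accuracy can be scored per sub-task, which is what makes the 8-way comparison in innovation (2) possible.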
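R-Drop, named in innovation (3), regularizes a model by running each batch through the network twice with dropout enabled and penalizing the divergence between the two predictive distributions. Below is a minimal PyTorch sketch under stated assumptions: the abstract does not give the model or the loss weighting, so `model`, `alpha`, and the single classification head are placeholders standing in for the thesis's sub-task heads.

```python
# Minimal sketch of R-Drop regularization. Assumes `model` is in training
# mode so dropout is active; `alpha` is a hypothetical weighting constant.

import torch
import torch.nn.functional as F

def rdrop_loss(model, x, labels, alpha=1.0):
    """Cross-entropy on two dropout-perturbed passes plus symmetric KL."""
    # Two forward passes: different dropout masks give two sub-models.
    logits1 = model(x)
    logits2 = model(x)

    # Ordinary task loss, averaged over the two passes.
    ce = 0.5 * (F.cross_entropy(logits1, labels) +
                F.cross_entropy(logits2, labels))

    # Symmetric KL divergence pulls the two predictive distributions together.
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(p1, p2, log_target=True, reduction="batchmean") +
                F.kl_div(p2, p1, log_target=True, reduction="batchmean"))

    return ce + alpha * kl
```

The extra KL term discourages the model from relying on any single dropout configuration, which is consistent with the small but uniform 0.1%-0.6% gains the thesis reports on both evaluation indicators.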
Keywords/Search Tags: Natural Language Processing, Chinese SQLNet, Pretraining Model, Model Optimization