
Research And Application Of NL2SQL In Financial Big Data Analysis

Posted on: 2022-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2518306743463444
Subject: Computer Science and Technology
Abstract/Summary:
Natural language to SQL (NL2SQL) has been one of the research hotspots in natural language processing in recent years. NL2SQL enables interaction between users and databases: it lets users without SQL expertise query a database on demand, and it improves the efficiency of data queries. Data to text (Data2Text) is the reverse process of NL2SQL; it automatically generates high-quality, human-readable text, reducing the difficulty of communication between users and computers. NL2SQL and Data2Text algorithms are developing rapidly, but most existing algorithms are English-oriented and cannot be applied directly to the Chinese financial field, which contains large-scale data that ordinary users find difficult to query and understand. This thesis therefore focuses on the application of NL2SQL and Data2Text technology in the Chinese financial field. The main work is as follows:
(1) The NL2SQL problem under the single-table query condition is studied. The NL2SQL task is transformed into a slot-filling problem based on the bidirectional long short-term memory (BiLSTM) model. In essence, the task is decoupled into eight classification subtasks that depend on one another and are learned jointly. When predicting the number of selected columns in the SQL statement, the model is extended from a single column to multiple columns. A condition-value generation model mitigates the mismatch between the natural-language description and the values stored in database columns.
(2) The NL2SQL problem under the multi-table query condition is studied. Based on the pre-trained BERT model, the NL2SQL task is divided into the generation of SQL clauses and the generation of joins. SQL clause generation follows the same idea as in the single-table case, with added prediction of grouping and sorting. The joins are generated using breadth-first search.
(3) The problem of generating text from data is studied. Following the sequence-to-sequence approach, a BiLSTM first encodes the input; multiple decoders are then introduced, and a latent variable factor determines which decoder generates the text. Model learning is improved by combining the attention, copy, and coverage mechanisms.
This thesis builds Chinese financial-domain datasets for the NL2SQL and Data2Text tasks under single-table and multi-table query conditions, respectively. The experimental results show that the proposed methods achieve higher accuracy in both SQL statement generation and text generation, and NL2SQL is applied in a financial quantitative analysis system.
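The abstract states only that joins in the multi-table setting are generated with breadth-first search; it does not describe the graph being searched. The following is a minimal sketch of one plausible reading, assuming that tables are nodes and foreign-key links are edges, so that BFS yields the shortest chain of joins between two tables. The schema, the table and column names, and the helper functions are all hypothetical and are not taken from the thesis.

```python
from collections import deque

# Hypothetical schema graph (not from the thesis): each entry is a
# foreign-key link between two tables, labeled with its join condition.
FOREIGN_KEYS = [
    ("company", "stock", "company.id = stock.company_id"),
    ("stock", "daily_quote", "stock.code = daily_quote.stock_code"),
    ("company", "industry", "company.industry_id = industry.id"),
]


def build_graph(foreign_keys):
    """Build an undirected adjacency list: table -> [(neighbor, condition)]."""
    graph = {}
    for left, right, condition in foreign_keys:
        graph.setdefault(left, []).append((right, condition))
        graph.setdefault(right, []).append((left, condition))
    return graph


def shortest_join_path(graph, start, goal):
    """Breadth-first search from start to goal.

    Returns a list of (table, condition) steps, or None if the two
    tables are not connected in the schema graph.
    """
    if start == goal:
        return []
    visited = {start}
    queue = deque([(start, [])])
    while queue:
        table, path = queue.popleft()
        for neighbor, condition in graph.get(table, []):
            if neighbor in visited:
                continue
            new_path = path + [(neighbor, condition)]
            if neighbor == goal:
                return new_path
            visited.add(neighbor)
            queue.append((neighbor, new_path))
    return None


def join_clause(graph, start, goal):
    """Render the BFS path as a FROM ... JOIN ... ON ... fragment."""
    path = shortest_join_path(graph, start, goal)
    if path is None:
        raise ValueError(f"no join path between {start} and {goal}")
    parts = [f"FROM {start}"]
    for table, condition in path:
        parts.append(f"JOIN {table} ON {condition}")
    return " ".join(parts)


if __name__ == "__main__":
    schema = build_graph(FOREIGN_KEYS)
    print(join_clause(schema, "industry", "daily_quote"))
```

Because BFS explores the graph level by level, the first path that reaches the target table uses the fewest joins, which is one reason it is a natural fit for this step. A query that touches more than two tables would need the search repeated or extended, which this sketch does not attempt.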
Keywords/Search Tags: NL2SQL, Data2Text, Table, Single table, BiLSTM, BERT