| Databases provide efficient storage of and access to large amounts of data. However, querying a database requires mastering the query language SQL, which poses a barrier for ordinary users. Research on natural language query translation aims to translate natural language queries into SQL statements (the Text-to-SQL task). This technology not only spares users the cost of learning specialized knowledge and improves the efficiency of querying data, but also helps build intelligent database query systems, and therefore has important research value. After a period of development, Text-to-SQL technology has made great progress, but several problems remain: most existing datasets are in English, and Chinese datasets are relatively scarce; existing research mainly focuses on domains such as airline ticket booking and geography, and rarely addresses the financial domain; the natural language queries in existing datasets are relatively simple and lack diversity, and the SQL statements are simple and not comprehensive; and existing Text-to-SQL models cannot handle some complex natural language questions. In response to these problems, this thesis constructs a Chinese dataset in the financial domain and studies one class of more complex query translation problems in a targeted manner. The specific work and contributions of this thesis are as follows. (1) This thesis constructs a Chinese dataset in the financial domain (SQL over Financial Text, SOFT for short). Compared with traditional benchmark datasets, the SOFT dataset has more diverse natural language query expressions and more comprehensive and complex SQL query statements, covering grouping and sorting queries, multi-table join queries, nested queries, row calculation queries, and others. Experiments show that the SOFT dataset poses new challenges to current Text-to-SQL models. (2) In view of the characteristics of the SOFT dataset, this thesis proposes RSFSQL, a Text-to-SQL model for the financial domain that uses RYANSQL as its reference model. Building on RYANSQL, RSFSQL adds a new method for processing row calculation queries. By analyzing the characteristics of row calculation questions and their SQL statements, the thesis proposes a row calculation query solution consisting of "question identification, question decomposition, model prediction, and rule-based assembly". Experiments and analysis verify the effectiveness of RSFSQL in processing row calculation queries. |