
Research On Multitask Learning Facilitated Vector Representation Algorithms For Queries

Posted on: 2022-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: X X Zhang
Full Text: PDF
GTID: 2518306572950799
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of theoretical research and technology in database systems, the complexity of optimizing and maintaining a database system keeps increasing, and the efficiency requirements on each submodule and subtask are becoming stricter. As statistical learning methods, especially deep neural networks, expand into more fields, smart database systems, which apply the theories and methods of statistical learning to database system tasks, have become an active research topic. Smart database systems use statistical learning models for database configuration, optimization, and design, and have achieved breakthroughs in many specific tasks. However, one common but often neglected problem is how to establish a reliable and efficient representation of a single query task. Such a representation can noticeably improve, from the feature side, multiple smart database applications, including query optimization, task flow modeling, and index structure selection. This thesis focuses on the query task representation problem in smart database systems and presents designs, implementations, and analyses covering example generation, the training framework, multi-modal feature representation and fusion, and multi-task learning with SQL language models.

For the basic framework, this thesis designs and implements an adaptable rapid example generation system and a model evaluation framework for inference tasks in smart database systems. Based on the TPC-C standard benchmark and MariaDB, the framework can generate any specified number of random, normal, and flexible query tasks with their corresponding examples, and performs data preprocessing, including histogram features, on the producer side. For training and evaluation, this thesis implements a CPU-only framework for feature parsing, model training, and model evaluation in TensorFlow, which supports the rapid development and evaluation of all the deep neural network models in this thesis.

For basic feature representation, this thesis uses deep neural networks to design a set of highly scalable representation and fusion schemes for the multi-modal features needed by prediction tasks in intelligent database systems. These schemes convert the structural features, semantic features, and auxiliary dense features required by the prediction tasks into tensors that deep neural networks can process and analyze. Three different multi-modal feature fusion methods are designed and compared, among which the fusion method forced by an auxiliary loss achieves clearly superior experimental results on the data set.

For language modeling of SQL query statements, this thesis implements and evaluates 12 different language models on the SQL query language and systematically analyzes the convergence performance and convergence position of each. The results show that the gated recurrent unit (GRU) and long short-term memory (LSTM) models are highly adaptable to the task of learning an SQL query language model in smart database systems.

For multi-task learning in smart database systems, which past research has neglected, this thesis designs and implements a multi-task learning framework that supports unidirectional isolation of language models and addresses two key issues: preventing feature leakage in language model fusion and balancing the losses of the regular tasks. The framework is tested on real data sets. The system not only uses fewer resources than three independent prediction systems but also improves the convergence of the learning tasks at low risk.

Overall, this thesis carries out algorithm research and system experiments from multiple angles on query representation in smart database systems, providing a reliable research foundation for other smart database systems based on query representation.
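
The abstract names loss balance between tasks as one key point of the multi-task framework but does not say how the balancing is done. As a rough illustration only, the sketch below shows one common heuristic: divide each task's loss by a reference scale (for example, the loss seen at the first training step) before averaging, so that a task with large raw loss values does not dominate. The function name, task names, and numbers are hypothetical and are not taken from the thesis.

```python
from typing import Dict


def balanced_multitask_loss(
    losses: Dict[str, float],
    reference: Dict[str, float],
    eps: float = 1e-8,
) -> float:
    """Average the task losses after dividing each by a reference scale.

    This is an assumed heuristic, not the thesis's method. `reference` holds
    a per-task scale, such as the loss observed at the first training step.
    """
    if losses.keys() != reference.keys():
        raise ValueError("losses and reference must cover the same tasks")
    if not losses:
        raise ValueError("at least one task is required")
    total = 0.0
    for task, value in losses.items():
        # Each term becomes a relative loss, comparable across tasks.
        total += value / (reference[task] + eps)
    return total / len(losses)


# Hypothetical task names and values, for illustration only.
initial = {"task_a": 200.0, "task_b": 0.5, "task_c": 4.0}
current = {"task_a": 100.0, "task_b": 0.25, "task_c": 2.0}
combined = balanced_multitask_loss(current, initial)
```

In this example every task has halved its initial loss, so each contributes about 0.5 to the average even though the raw values differ by orders of magnitude.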
Keywords/Search Tags:smart database systems, language models, multi-modal feature fusion, multi-task learning