A Query Performance Prediction Method For Database Based On Cost Calibration

Posted on: 2024-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Wang
GTID: 2568306923452464
Subject: Computer technology

Abstract/Summary:
With the rapid development of big data technology, databases have become indispensable infrastructure in modern society, bringing growing demand for databases and higher requirements for their performance. In this context, query performance prediction has become an important part of query optimization: it helps database administrators tune database parameters and provides decision support for improving system performance. An effective query performance prediction model is therefore of great significance for improving the efficiency of database queries.

In recent years, learning-based query optimizer technology has gradually emerged, and research on query performance prediction methods has attracted considerable attention. Two kinds of methods are currently under study. The first optimizes the traditional cost model within the database in order to predict query performance. Although this makes effective use of the traditional cost model, the model's cost formulas may not adapt well enough to complex scenarios, resulting in less precise predictions. The second is learning-based query performance prediction, which is divided into two types of models: plan-level and operator-level. Most plan-level neural network models fail to capture the tree structure of the execution plan effectively, and their feature construction for each node is simple, resulting in large prediction errors. Operator-level neural network models improve prediction accuracy, but they train slowly, which reduces efficiency. Neither the plan-level nor the operator-level learning-based models can solve the "cold start" problem of query performance prediction.

To address the limitations of the aforementioned methods, this thesis explores a deep integration of traditional and learning-based methods so that each compensates for the shortcomings of the other. The resulting method not only solves the "cold start" problem but also effectively improves the accuracy of query performance prediction. The research focuses on two challenges. First, how can a reasonable and effective framework for query performance prediction be constructed? Second, how can a novel learning-based method be constructed to further improve the accuracy of query performance prediction? To meet these challenges, this thesis studies a database query performance prediction method based on cost calibration. In addition, the proposed method is integrated into the PostgreSQL database system and visualized in a user-friendly manner. The main work and contributions of this thesis are summarized as follows:

1. A framework for query performance prediction based on cost calibration is proposed, which effectively combines the traditional cost model with a learning-based model. The framework consists of two parts. The first part calibrates the traditional cost model: a regression method is used to calibrate the cost factors of the database in the current hardware environment, and the calibrated cost model gives a preliminary prediction of query performance, which solves the cold start problem of query performance prediction. The second part uses a learning-based method to calibrate the predicted query performance: the operator-level model fully explores the performance characteristics of the query plan, and the plan-level model further calibrates the query performance, improving the accuracy of the prediction results.

2. A novel query performance calibration method based on deep learning is proposed, which effectively combines plan-level and operator-level models to further improve the accuracy of query performance prediction. The method involves two steps: operator-level model prediction and plan-level model prediction. In the first step, an innovative differential information feature embedding model is designed at the operator level. This model is mainly used to model the categorical and numerical features of query plan nodes. A differential information network built from an RNN and an FNN analyzes the correlation between features and query time and generates node embedding features, yielding an embedded feature tree. In the second step, a plan-tree convolutional neural network model is constructed. This model extracts the tree structure information in the embedded feature tree through tree convolution kernels and predicts the deviation value of query performance through a fully connected layer. Combined with the prediction results of the cost model, this method greatly improves the accuracy of query performance prediction.

3. Extensive experiments on the public IMDB, TPC-H, and SSB datasets were conducted to verify the effectiveness of the proposed method. The proposed method achieves better accuracy than the baselines on the Q-Error and MRE evaluation metrics. Ablation experiments verify the necessity and rationality of each module. The query performance prediction method is integrated into the PostgreSQL database and presented in a user-friendly visualization.
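The cost-factor calibration step in contribution 1 can be illustrated with a minimal sketch. This is not the thesis's implementation; it makes several assumptions. Query time is assumed to be a linear combination of per-unit cost factors (named here after PostgreSQL's planner cost parameters) and resource counts. The counts and "true" factors are synthetic values invented for illustration, and the regression is plain least squares solved through the normal equations. Q-Error is computed as max(pred/actual, actual/pred), and MRE is taken here to mean mean relative error.

```python
# Hypothetical sketch: calibrate cost factors by least squares, then score with
# Q-Error and MRE. The linear model and all numbers are illustrative assumptions.

FACTOR_NAMES = ["seq_page_cost", "random_page_cost", "cpu_tuple_cost",
                "cpu_index_tuple_cost", "cpu_operator_cost"]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def calibrate(counts, times):
    """Least-squares fit of per-unit costs via (X^T X) c = X^T t."""
    k = len(counts[0])
    xtx = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    xtt = [sum(row[i] * t for row, t in zip(counts, times)) for i in range(k)]
    return solve_linear(xtx, xtt)


def predict(factors, row):
    """Cost-model prediction: sum of factor * resource count."""
    return sum(f * n for f, n in zip(factors, row))


def q_error(pred, actual):
    return max(pred / actual, actual / pred)


def mean_relative_error(preds, actuals):
    return sum(abs(p - a) / a for p, a in zip(preds, actuals)) / len(actuals)


# Synthetic calibration workload: one row of resource counts per query
# (sequential pages, random pages, tuples, index tuples, operator calls).
COUNTS = [
    [10, 0, 50, 0, 50],
    [20, 0, 100, 0, 300],
    [1, 5, 5, 5, 10],
    [2, 30, 30, 60, 90],
    [5, 1, 20, 1, 80],
    [15, 8, 70, 16, 70],
]
TRUE_FACTORS = [0.5, 2.0, 0.01, 0.005, 0.0025]  # made-up per-unit times
TIMES = [predict(TRUE_FACTORS, row) for row in COUNTS]  # noise-free "measurements"

fitted = calibrate(COUNTS, TIMES)
preds = [predict(fitted, row) for row in COUNTS]
for name, value in zip(FACTOR_NAMES, fitted):
    print(f"{name}: {value:.6f}")
print("max Q-Error:", max(q_error(p, t) for p, t in zip(preds, TIMES)))
print("MRE:", mean_relative_error(preds, TIMES))
```

With real measurements the times would be noisy and the fit would only approximate the hardware's per-unit costs. A learned model such as the one described in contribution 2 would then predict the remaining deviation.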
Keywords/Search Tags: Query Performance Prediction, Cost Model, Query Optimization, Deep Learning