Research On Database Query Optimization Method Based On Deep Reinforcement Learning

Posted on:2023-06-19

Degree:Master

Type:Thesis

Country:China

Candidate:R Z Zhao

Full Text:PDF

GTID:2558306623993799

Subject:Engineering

Abstract/Summary:

Databases are an important foundation for digital infrastructure, and query processing and optimization have long been a central research direction in the database field. In online analytical processing (OLAP) scenarios, workloads contain many complex analytical queries over multiple relations. Traditional optimizers incur a high search cost during join optimization and struggle to guarantee the quality of query plans when the data distribution is nonlinear. In addition, cardinality estimation in industrial-grade databases relies on statistical information together with independence assumptions and fixed constants, which makes it difficult to capture changes in the current state of the data. Taking PostgreSQL, an open-source relational database, as the research object and focusing on existing problems in database query optimization, this thesis proposes a join order optimization method based on a dynamic double DQN and a query cardinality estimation method based on gradient boosting decision trees (GBDT), optimizing database query performance from two perspectives. The main research contents are as follows.

(1) In join order optimization, DQN tends to overestimate action values, which can limit query performance. In addition, ε-greedy exploration is inefficient and does not support deep exploration. A join order optimization method based on deep reinforcement learning is therefore proposed. The join query is first modeled as a Markov decision process, and the neural network model is trained with a weighted double deep Q-network to improve the prediction accuracy of the training network. Actions are selected by a dynamic progressive search strategy, which increases the randomness and depth of exploration so that exploration accumulates higher information gain. A tree-structured state encoding preserves the hierarchical information of the relations and further improves the effectiveness of the method. After the cost of each query plan is estimated, a join plan that fits the data distribution and has a balanced query load is selected.

(2) In query cardinality estimation, deep learning models are costly to train offline and lack the generalization ability to capture changes in the database. To address these problems, a gradient boosting decision tree is used to predict cardinality by extracting the mapping between the relational features of a query's base tables and the estimated cardinality. Piecewise linear regression trees are additionally used in the model, which, owing to the model's light weight, shortens training time. Generalization is improved by periodically updating the extended model so that it captures changes in the database. In addition, a confidence interval is set by quantile regression to keep cardinality estimates robust within the interval and to avoid abnormal predictions.

Finally, the two optimization methods are applied to PostgreSQL. Test results show that, compared with PostgreSQL's original cardinality estimation method, cardinality estimation accuracy improves by about 5 times and average query performance improves by 32.7%. This study extends the generalizability and scalability of the query optimizer and, by improving cardinality estimation accuracy, improves the query execution efficiency of the database system.
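
The overestimation problem named in item (1) can be illustrated with the standard double DQN target. The sketch below is a minimal illustration, not the thesis's implementation. It shows plain double DQN rather than the weighted variant the abstract mentions, and the function names and the Q-value lists are hypothetical.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float,
               q_target_next: Sequence[float], done: bool = False) -> float:
    """Vanilla DQN target: the target network both selects and evaluates
    the next action, so noisy high values are picked up by the max."""
    if done or not q_target_next:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      done: bool = False) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done or not q_online_next:
        return reward
    # Index of the action the online network prefers in the next state.
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


# Hypothetical Q-values for three candidate joins in the next state.
q_online = [1.0, 3.0, 2.0]
q_target = [1.5, 2.0, 4.0]
# Reward of -1.0 is an assumed stand-in for a negative plan cost.
print(dqn_target(-1.0, 0.5, q_target))
print(double_dqn_target(-1.0, 0.5, q_online, q_target))
```

With these assumed numbers the vanilla target uses the largest target-network value, while the double DQN target uses the target-network value of the action chosen by the online network, which is lower. Decoupling selection from evaluation is what reduces the upward bias.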
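
The quantile-regression interval in item (2) can be sketched in the same hedged spirit. Quantile regression trains a model with the pinball loss. The abstract does not say how the resulting interval is applied, so clamping the point estimate into the interval is an assumption made here for illustration, and `pinball_loss` and `guard_estimate` are hypothetical names.

```python
def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Pinball (quantile) loss for quantile level q in (0, 1).
    Under-prediction is weighted by q, over-prediction by 1 - q."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    diff = y_true - y_pred
    return q * diff if diff >= 0 else (q - 1.0) * diff


def guard_estimate(point: float, lower: float, upper: float) -> float:
    """Assumed usage: clamp a point cardinality estimate into the
    interval [lower, upper] predicted by two quantile models."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    return min(max(point, lower), upper)


# A high quantile level penalizes under-prediction more than over-prediction.
print(pinball_loss(100.0, 80.0, 0.75))   # under-prediction by 20 rows
print(pinball_loss(100.0, 120.0, 0.75))  # over-prediction by 20 rows

# An abnormal point estimate is pulled back to the interval's upper bound.
print(guard_estimate(1e9, 50.0, 500.0))
```

In practice the lower and upper bounds would come from two models fitted at a low and a high quantile level. They are passed in as plain numbers here to keep the sketch self-contained.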

Keywords/Search Tags:

query optimization, join order, cardinality estimation, database, deep reinforcement learning

Related items

1	Research On Learning-based Database Query Optimization
2	Cardinality Estimation Based On Multi-Feature Divided Composite Model
3	Join Order Selection Optimization With Deep Graph-based Representation
4	Optimization And Implementation Of Query Method In Distributed Relational Database
5	Research And Implementation Of Adaptive Multi-table Join Cardinality Estimation Optimization Method Based On Correlation Analysis
6	Study On Cardinality Estimation Method Based On Multi-Head Self-Attention Mechanism
7	Deep Autoregressive Model For Cardinality Estimation
8	Research On Cardinality Estimation Of Multi-Attribute Queries Based On Local Depth Autoregressive Framework
9	Research On Optimization Of Multi-table Join Order Of Database Based On Monte Carlo Tree Search
10	Research And Implementation Of Database Query Optimization Based On Graph Neural Network