Self-learning Query Optimization Techniques Under Big Data Platform

Posted on: 2024-06-06

Degree: Master

Type: Thesis

Country: China

Candidate: X Chen

Full Text: PDF

GTID: 2568307079960329

Subject: Computer Science and Technology

Abstract/Summary:

In the past two decades, data has grown at an unprecedented rate, and systems for storing, processing, and analyzing data have become both common and critical. The key to the performance of a data system is the query optimizer, which transforms high-level declarative queries (such as SQL) into efficient execution plans. However, query optimization is very complex and poses two key challenges. First, optimizers rely on a large number of manually designed heuristics to reduce complexity, and these heuristics do not deliver optimal performance. Second, query optimizers are very expensive to develop: human experts may take several months to write a first version, which can then take years to perfect.

Machine learning has emerged as a promising direction for compensating for the shortcomings of traditional optimizers because of its adaptability and accuracy. A model can learn from past query execution logs and be used to predict the optimal execution plan for new queries. Recently, research on Machine Learning for Database (ML4DB) has attracted increasing attention and has shown that the performance of traditional databases can be improved in a data-driven manner. Within this line of work, reinforcement learning has been applied to build autonomous query optimizers that generate query plans on their own and find competitive plans without relying on a traditional query optimizer. However, these "alternative optimizer" approaches have not yet been put into practice, and commercial database vendors still hesitate to incorporate them into their database management systems. This reluctance stems from the fact that existing methods overestimate the capabilities of machine learning models. Machine learning models are data-driven, which enables them to learn new data distributions and the machine-specific features of a deployment. Nevertheless, they also suffer from problems such as cold start and the inability to learn the internal query optimization rules of a database.

To compensate for the shortcomings of machine learning and to leverage the years of development behind mature databases, this thesis proposes LEON (ML-aidEd query OptimizatioN). LEON improves the self-tuning capability of the expert query optimizer by combining machine learning with the fundamental knowledge of the expert query optimizer, so that the optimizer adapts to a specific deployment environment. To train the machine learning model, this thesis proposes a pairwise ranking objective, which differs substantially from the regression objectives used in previous work. To help the optimizer escape local minima and avoid failures, this thesis proposes an exploration strategy based on ranking and uncertainty, which discovers valuable plans for the optimizer. In addition, this thesis proposes a pruning method guided by the machine learning model, which improves planning efficiency without sacrificing much performance. Finally, experiments on a wide range of publicly available datasets show that the proposed framework can outperform state-of-the-art methods in end-to-end latency, training efficiency, and stability.
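The abstract does not give the exact form of the pairwise ranking objective, so the following is only an illustrative sketch of how such an objective generally differs from regression. It assumes a standard logistic (RankNet-style) pairwise loss, which may not be the formulation used in the thesis. The function name pairwise_ranking_loss, its parameters, and the convention that a lower score means a faster plan are all hypothetical and chosen for this example.

```python
import math


def pairwise_ranking_loss(score_a, score_b, latency_a, latency_b):
    """Logistic pairwise loss for two candidate plans of the same query.

    Assumption (not from the thesis): the model assigns each plan a
    score, and a LOWER score is meant to indicate a FASTER plan.
    Only the relative order of the two plans is supervised; the model
    is never asked to predict the latency values themselves.
    """
    if latency_a == latency_b:
        # No preference between the plans, so nothing to learn here.
        return 0.0

    # Margin by which the model agrees with the true order:
    # positive when the truly faster plan received the lower score.
    if latency_a < latency_b:
        margin = score_b - score_a
    else:
        margin = score_a - score_b

    # -log(sigmoid(margin)) written in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Plan A ran in 10 ms and plan B in 50 ms.
agrees = pairwise_ranking_loss(1.0, 3.0, 10.0, 50.0)     # model ranks A first
disagrees = pairwise_ranking_loss(3.0, 1.0, 10.0, 50.0)  # model ranks B first
print(agrees < disagrees)
```

Under this assumed formulation, the loss depends only on the difference between the two scores, so a model can rank plans correctly even when it cannot estimate absolute latencies accurately, which is the usual motivation for preferring a ranking objective over a regression one.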

Keywords/Search Tags:

Machine Learning, Query Optimization, DBMS, Machine Learning for Database

Related items

1	Research On Database Query Time Prediction Algorithm Based On Deep Learning
2	Optimization Techniques For Distributed Machine Learning Systems
3	Research And Application Of Machine Learning Method Based On Swarm Intelligence Optimization
4	Research And Implementation Of Execution Plan Cache Optimization Based On Machine Learning
5	Research On VM Placement Optimization Algorithm Based On Machine Learning
6	Research On Query Reformulation Based On Machine Learning
7	Research On Optimization And Application Of Extreme Learning Machine
8	Study On The Structure Optimization Of Extreme Learning Machine And Its Application
9	Partial Learning Machine: Concept, Algorithms And Applications
10	Optimization And Application Of Large Scale Machine Learning System