Research And Implementation Of Execution Plan Cache Optimization Based On Machine Learning

Posted on:2022-11-11

Degree:Master

Type:Thesis

Country:China

Candidate:F Wang

Full Text:PDF

GTID:2518306764976999

Subject:Automation Technology

Abstract/Summary:

With the advent of the big data era, artificial intelligence, as an important tool for data processing and analysis, has been applied in many fields. As the data carrier for artificial intelligence, the database needs to provide faster and more convenient query services. However, as data volumes grow, traditional database optimization methods face new bottlenecks, so finding more effective optimization methods has become an urgent problem. Artificial intelligence has now been studied extensively for database optimization tasks such as cardinality estimation, join order selection, and execution plan caching. An execution plan cache bypasses the optimizer: it saves the execution plans of historical queries in a cache and then assigns a plan from the cache to each incoming query. While this approach can save query time, there is currently no efficient way to select execution plans from the cache.

Many studies have tried to improve the accuracy of execution plan caching, but most share the following three problems. First, they lack an efficient way to extract feature vectors. Feature extraction for bind variables is the key to classifying execution plans; if the features of bind variables with categorical properties cannot be extracted, model accuracy inevitably suffers. Second, they have difficulty maintaining the execution plan cache dynamically. When the data in the database changes, a saved execution plan may no longer suit the new parameter space, so the execution plan cache model needs to be adjusted dynamically to preserve prediction accuracy. Finally, existing models train inefficiently in practical application scenarios. Because database systems are queried online, they do not have the abundant data and training time available to offline tasks. The model must therefore be trained in a short time and reach high accuracy with a small number of training samples.

To address these problems, the main work of this thesis is to propose COPC, a machine-learning-based method for execution plan caching. Specifically, COPC proposes a new data encoding method for the parametric query optimization task. Unlike traditional data encoding methods, it captures not only the magnitude information of the parameters in the table but also their spatial information. This addresses the problem of insufficient parameter feature extraction, because the magnitude and spatial information can accurately capture the classification features of different parameters. Then, to maintain the execution plan cache dynamically, this thesis proposes an adaptive random forest algorithm based on this encoding method and applies it to the classification task of the execution plan cache. This allows COPC to achieve high accuracy with only a small amount of data and training time, and the model supports incremental training. The proposed model and the baseline model are tested on public datasets, and the experimental results demonstrate the effectiveness of COPC in terms of efficiency and accuracy. In addition, this thesis conducts comparative experiments with common machine learning classification algorithms, and the results show the advantage of the adaptive random forest algorithm for execution plan caching tasks.
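
The general idea of a learned execution plan cache can be illustrated with a small sketch. This is not COPC itself: the abstract does not specify the encoding or the forest, so the `encode` function (log-scaled magnitude plus relative position within a sorted column), the `PlanCache` class, the `max_distance` threshold, and the plan names are all hypothetical, and a simple nearest-centroid classifier stands in for the adaptive random forest. It only shows the shape of the approach: encode a bind value, look up a cached plan, fall back to the optimizer when no cached plan is a confident match, and update the model one sample at a time.

```python
import bisect
import math
from collections import defaultdict


def encode(value, sorted_column):
    """Hypothetical encoding of one numeric bind value: (magnitude, position)."""
    magnitude = math.log10(abs(value) + 1.0)
    # Fraction of column values strictly below `value`.
    position = bisect.bisect_left(sorted_column, value) / len(sorted_column)
    return (magnitude, position)


class PlanCache:
    """Nearest-centroid stand-in for the classifier, updated incrementally."""

    def __init__(self, max_distance=0.5):
        self.max_distance = max_distance  # assumed confidence threshold
        self.sums = defaultdict(lambda: [0.0, 0.0])
        self.counts = defaultdict(int)

    def learn(self, features, plan_id):
        """Incremental update: fold one (features, plan) sample into its centroid."""
        running = self.sums[plan_id]
        running[0] += features[0]
        running[1] += features[1]
        self.counts[plan_id] += 1

    def lookup(self, features):
        """Return the closest cached plan id, or None to signal 'ask the optimizer'."""
        best_plan, best_dist = None, float("inf")
        for plan_id, running in self.sums.items():
            n = self.counts[plan_id]
            centroid = (running[0] / n, running[1] / n)
            dist = math.dist(features, centroid)
            if dist < best_dist:
                best_plan, best_dist = plan_id, dist
        return best_plan if best_dist <= self.max_distance else None


# Toy usage: a column holding the values 1..1000 and two hypothetical plans.
column = list(range(1, 1001))
cache = PlanCache()
for value, plan in [(5, "index_scan"), (10, "index_scan"),
                    (900, "full_scan"), (950, "full_scan")]:
    cache.learn(encode(value, column), plan)

print(cache.lookup(encode(7, column)))    # close to the selective samples
print(cache.lookup(encode(100, column)))  # far from both centroids
```

In this sketch a `None` result means the query would go through the regular optimizer, and the plan it produces could then be fed back through `learn`, which is one simple way to keep the cache in step with changing data.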

Keywords/Search Tags:

Query Plan Cache, Query Optimization, Machine Learning

Related items

1	Research And Improve On The Query Optimize Of MySQL
2	Research Of Cache-Based Query Optimization Technologies In Swift-Query Tool
3	Research And Design For XML Query Platform Based On XQuery
4	Research On Query Reformulation Based On Machine Learning
5	Query Execution Plan Cache Based On DM Embedded Database
6	Manet-based Mobile Database Buffer And Query Optimization Techniques
7	Research On Generation Method Of Low Cost And High Parallel Query Plan
8	The Research And Implementation Of Indexing And Query Techniques Based On Range Query
9	Research On The Key Technologies For Data Query In Wireless Sensor Network
10	Distributed Database For Read-only Application Of The Model Structure And Query Optimization