
Automatic Database Tuning For Multi-system Settings

Posted on: 2021-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z Liu
Full Text: PDF
GTID: 2428330611457227
Subject: Computer technology
Abstract/Summary:
A relational database system has a large number of configuration parameters covering I/O optimization, parallel computing, query planning, memory allocation, logging, and more. Ordinary users, and even database experts, find it difficult to tune these parameters for good performance. Many operational guidelines for database tuning exist, but they cannot always deliver superior performance: a guideline may work well in specific usage scenarios without being suitable for all of them. As a result, many organizations have to hire expensive specialists to tune their database systems. At the same time, with the rapid development of information technology, the interdependence and mutual influence of the multiple systems that make up a back end are becoming more and more prominent. Previous studies on performance, availability, and reliability mainly targeted a single system, whether on a single node or on multiple nodes, and largely ignored the overall optimization goal across multiple systems. The purpose of this study is to propose a tuning method that builds on single-system tuning and extends effectively to multiple systems, and to develop a set of tuning tools.

The contributions of this thesis fall into three parts. First, we build a tuning model. We construct DuoSQL, a data warehouse based on the relational database PostgreSQL and the distributed computing framework Spark that separates storage from computing and supports elastic resource configuration, and take it as the tuning target of this thesis. Because PostgreSQL and Spark interact with each other in DuoSQL, the parameters of the two systems are unified in a multi-system-oriented tuning model that considers the overall impact of the system parameters on the multi-system database. In this way, a database tuning model for multiple systems is built on the basis of tuning a single database system.

Second, based on the tuning model, we apply dimensionality reduction and clustering to the performance metrics to remove redundancy, and we rank the configuration parameters of PostgreSQL and Spark by weight, which reduces the amount of noise in the data and improves its quality. Within the tuning model, the workload observed inside the system is mapped to and matched against the system's workloads. Gaussian process regression and gradient descent are then used to help the tuning model recommend configuration parameters that yield good performance.

Finally, when DuoSQL processes large-scale data and data must be exchanged between PostgreSQL and the Spark cluster, the time-consuming phases account for too large a share of the operation. We therefore classify parameters according to the parameter spaces of the different systems, dividing them into a network resource parameter space and a computing resource parameter space. Based on this classification, we form a collaborative tuning strategy for different usage scenarios that performs more fine-grained tuning, thereby optimizing both the query time and the proportion of time spent in each stage.
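The recommendation step described above (Gaussian process regression combined with gradient descent) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes two knobs normalized to [0, 1] (think of them as stand-ins for one PostgreSQL parameter and one Spark parameter), a synthetic latency function, an RBF kernel with a fixed length scale, and a simple step-halving rule; all function names and constants below are hypothetical.

```python
import numpy as np

LENGTH_SCALE = 0.3  # assumed RBF kernel width on the normalized [0, 1] knob scale


def rbf_kernel(a, b):
    """RBF kernel matrix between the rows of a (n, d) and b (m, d)."""
    d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(d2, 0.0) / LENGTH_SCALE**2)


def fit_gp(x_train, y_train, noise=1e-3):
    """Solve (K + noise * I) alpha = y - mean(y) for the GP posterior-mean weights."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    return np.linalg.solve(k, y_train - y_train.mean())


def predict(x, x_train, y_train, alpha):
    """Posterior mean (predicted latency) at a single configuration x of shape (d,)."""
    return float(rbf_kernel(x[None, :], x_train)[0] @ alpha + y_train.mean())


def predict_grad(x, x_train, alpha):
    """Analytic gradient of the posterior mean with respect to x."""
    k = rbf_kernel(x[None, :], x_train)[0]
    return ((x_train - x[None, :]) / LENGTH_SCALE**2 * (k * alpha)[:, None]).sum(axis=0)


def recommend(x_train, y_train, steps=200, lr=0.05):
    """Gradient descent on the predicted latency, starting from the best observed config."""
    alpha = fit_gp(x_train, y_train)
    x = x_train[np.argmin(y_train)].copy()
    best = predict(x, x_train, y_train, alpha)
    for _ in range(steps):
        cand = np.clip(x - lr * predict_grad(x, x_train, alpha), 0.0, 1.0)
        val = predict(cand, x_train, y_train, alpha)
        if val < best:
            x, best = cand, val   # accept only steps that lower the prediction
        else:
            lr *= 0.5             # otherwise shrink the step size
    return x, best


# Synthetic observations (an assumption): latency is lowest near knob values (0.6, 0.3).
rng = np.random.default_rng(0)
configs = rng.uniform(0.0, 1.0, size=(40, 2))
latency = (configs[:, 0] - 0.6) ** 2 + (configs[:, 1] - 0.3) ** 2 + 0.01 * rng.normal(size=40)

recommended, predicted = recommend(configs, latency)
print("recommended normalized knobs:", recommended)
print("predicted latency:", predicted)
```

Starting the descent from the best configuration observed so far keeps the search in a region where the model has data. A real tuner would also have to map the normalized values back to valid parameter settings and account for the model's uncertainty, which this sketch leaves out.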
Keywords/Search Tags:PostgreSQL, Spark, Configuration Parameter, Machine Learning, Collaborative Tuning