Automatic Storage Structure Selection System Based On Learned Cost

Posted on: 2021-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: ? Wei
Full Text: PDF
GTID: 2428330611999994
Subject: Computer Science and Technology
Abstract/Summary:
In database system design, the storage structure of a table determines, in theory, the complexity of data access, and the right storage structure for a table depends on the system workload. For example, for write-heavy workloads, database systems built on the LSM storage structure outperform traditional databases, while for analytical workloads, columnar storage completes large queries in less time. In a hybrid workload, however, the queries processed on different horizontal partitions of a table differ, and the queries processed on the same partition may change, so the optimal storage structure for each partition changes over time. Using a single storage structure, or adjusting the storage structure by hand, cannot take full advantage of the available storage structures. This thesis therefore proposes an automatic storage structure selection system based on learned cost, which solves the storage engine selection problem and the data layout selection problem for a workload at the same time. Within the proposed system, the thesis designs a machine learning-based cost model for comparing costs across storage engines, and a database performance testing process for building that cost model. Experimental results show that the learned cost model gives a more accurate estimate of operating performance across storage engines. In a workload test using TPC-H, where each query is either transactional or analytical, the overall query time with the chosen storage structures is about 35% lower than with static storage structures. For hybrid workloads in which analytical and transactional queries run on different partitions, the system therefore substantially reduces overall query time without human intervention.
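The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the cost model here is a deliberately simple stand-in (a least-squares fit of seconds per row, through the origin) rather than the machine learning model the thesis designs, and every name and number (`BENCHMARK`, `fit_cost_model`, `estimate_cost`, `choose_structure`, the engine labels, the timings) is hypothetical and made up for illustration.

```python
from collections import defaultdict

# Hypothetical benchmark samples: (engine, query_kind, rows_touched, seconds).
# These numbers are invented for illustration; they are NOT results from the thesis.
BENCHMARK = [
    ("row", "point_write", 1_000, 0.020),
    ("row", "scan", 1_000_000, 4.0),
    ("lsm", "point_write", 1_000, 0.005),
    ("lsm", "scan", 1_000_000, 6.0),
    ("column", "point_write", 1_000, 0.080),
    ("column", "scan", 1_000_000, 1.0),
]


def fit_cost_model(samples):
    """Learn seconds-per-row for each (engine, query_kind).

    Uses least squares through the origin: slope = sum(x*y) / sum(x*x).
    This is an assumed stand-in for a real learned cost model.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for engine, kind, rows, secs in samples:
        num[(engine, kind)] += rows * secs
        den[(engine, kind)] += rows * rows
    return {key: num[key] / den[key] for key in num}


def estimate_cost(model, engine, workload):
    """Predicted total seconds for a workload given as [(query_kind, rows), ...]."""
    return sum(model[(engine, kind)] * rows for kind, rows in workload)


def choose_structure(model, engines, workload):
    """Pick the engine with the lowest predicted cost for one partition's workload."""
    return min(engines, key=lambda e: estimate_cost(model, e, workload))


model = fit_cost_model(BENCHMARK)
engines = ["row", "lsm", "column"]

# Two partitions of the same table with different (hypothetical) workloads.
write_heavy = [("point_write", 500_000), ("scan", 10_000)]
analytical = [("scan", 5_000_000), ("point_write", 100)]

choices = {
    "partition_a": choose_structure(model, engines, write_heavy),
    "partition_b": choose_structure(model, engines, analytical),
}
```

Because the choice is made per partition, re-running `choose_structure` when a partition's observed workload changes is one way the selection could follow the workload over time; how the thesis's system triggers and applies such changes is not stated in the abstract.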
Keywords/Search Tags: storage structure, self-driving database, hybrid workload, database system, machine learning