Auto-sharding Technique And Algorithm For Distributed Relation Database Based On SQL History

Posted on: 2019-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: G Huang
Full Text: PDF
GTID: 2428330590475370
Subject: Computer technology
Abstract/Summary:
Faced with rapidly growing volumes of data, traditional centralized relational databases suffer from slow data access, poor scalability, and poor concurrency. Distributed relational databases built on cloud computing offer horizontal scalability and distributed parallel processing, and can therefore meet the processing demands of big data. The main problem introduced by horizontal expansion is how to partition and store data across a distributed relational database. Although sharding solves the data partitioning problem to a certain extent, choosing a correct and feasible partition scheme while preserving the relational model of the existing database remains one of the challenges for distributed relational databases. Currently, the industry determines partition schemes mainly from the administrator's background knowledge, business knowledge, work experience, and subjective judgment, which makes the resulting schemes highly subjective. Moreover, for an enterprise database that already contains a large number of tables, determining the partition scheme manually is very costly. Therefore, to make database partitioning more objective and convenient, and after studying the main factors that affect storage strategies for relational databases, this thesis proposes a database partition strategy based on SQL history. The strategy uses SQL history to obtain the database access pattern, from which a partition scheme for the relational data can be derived automatically and efficiently. The main parts of this thesis are listed below:
1. ANTLR is studied and used to implement a simple SQL statement parser, which provides the feature values needed to generate a data partition scheme.
2. Based on the analysis of SQL history, some changes to traditional hierarchical clustering are proposed. A data table storage location decision algorithm is proposed to automatically partition the database vertically and store the tables on multiple servers.
3. A two-stage efficiency model is designed. The model evaluates whether a horizontal partition scheme allows most SQL operations to sink (that is, to be pushed down to the individual shards). Based on the Akka distributed computing framework, it explores all appropriate partition strategies for all partitioned tables and automatically shards the data across multiple servers.
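The idea behind the first two parts can be illustrated with a minimal sketch. It is not the thesis's algorithm: the abstract does not specify the clustering changes or the feature values, so the sketch assumes a regex as a hypothetical stand-in for the ANTLR parser, Jaccard co-access similarity as the feature, and plain average-linkage agglomerative clustering. The names `tables_in`, `co_access_similarity`, and `cluster_tables` are illustrative only.

```python
import re
from itertools import combinations

# Hypothetical stand-in for the ANTLR-based parser: a regex that only picks up
# table names that directly follow FROM, JOIN, UPDATE or INTO.
TABLE_RE = re.compile(
    r"\b(?:FROM|JOIN|UPDATE|INTO)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE
)


def tables_in(sql):
    """Return the set of table names referenced by one SQL statement."""
    return {name.lower() for name in TABLE_RE.findall(sql)}


def co_access_similarity(history):
    """Assumed feature: Jaccard similarity of the statements touching each table."""
    touched = {}  # table name -> indices of the statements that reference it
    for i, sql in enumerate(history):
        for table in tables_in(sql):
            touched.setdefault(table, set()).add(i)
    sim = {}
    for a, b in combinations(sorted(touched), 2):
        shared = len(touched[a] & touched[b])
        total = len(touched[a] | touched[b])
        sim[(a, b)] = shared / total
    return sorted(touched), sim


def cluster_tables(history, n_servers):
    """Merge the most similar table groups until one group per server remains."""
    tables, sim = co_access_similarity(history)
    clusters = [{t} for t in tables]

    def linkage(c1, c2):
        # Average linkage: mean pairwise similarity between the two groups.
        pairs = [(min(a, b), max(a, b)) for a in c1 for b in c2]
        return sum(sim[p] for p in pairs) / len(pairs)

    while len(clusters) > max(n_servers, 1):
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    history = [
        "SELECT * FROM orders JOIN order_items ON orders.id = order_items.order_id",
        "SELECT * FROM users JOIN profiles ON users.id = profiles.user_id",
        "UPDATE users SET name = 'x' WHERE id = 1",
    ]
    print(cluster_tables(history, 2))
```

Tables that are frequently referenced by the same statements end up in the same group, so joins between them stay on one server. A real implementation would also need to weigh table sizes and server load, which this sketch ignores.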
Keywords/Search Tags: Distributed Relational Database, SQL Parsing, Vertical Partition, Horizontal Partition