Auto-sharding Technique And Algorithm For Distributed Relation Database Based On SQL History

Posted on:2019-11-05

Degree:Master

Type:Thesis

Country:China

Candidate:G Huang

GTID:2428330590475370

Subject:Computer technology

Abstract/Summary:

Faced with rapidly growing data volumes, traditional centralized relational databases suffer from slow data access, poor scalability, and poor concurrency. Distributed relational databases built on cloud computing offer horizontal scalability and distributed parallel processing, and can therefore meet the processing needs of big data. The main problem introduced by horizontal scaling is how to divide and store the data across the distributed relational database. Although sharding solves the data partitioning problem to some extent, choosing a correct and feasible partition scheme while preserving the relational-model characteristics of the existing database remains a challenge for distributed relational databases.

Currently, in industry, the partition scheme is determined mainly by the administrator's background knowledge, business knowledge, work experience, and subjective judgment, which makes the resulting scheme highly subjective. Moreover, for an enterprise database that already contains a large number of tables, determining the partition scheme manually is very costly. To make database partitioning more objective and convenient, this thesis studies the main factors that affect the storage strategy of relational databases and proposes a database partition strategy based on SQL history. The strategy uses the SQL history to derive the database access pattern, from which a partition scheme for the relational data can be obtained automatically and efficiently.

The main parts of this thesis are as follows:

1. ANTLR is studied and used to implement a simple SQL statement parser, which provides the feature values needed to generate the data partition scheme.
2. Based on the analysis of the SQL history, some changes to traditional hierarchical clustering are proposed. A table storage location decision algorithm is proposed that automatically partitions the database vertically and places the tables on multiple servers.
3. A two-stage efficiency model is designed. The model evaluates whether a horizontal partition scheme allows most SQL operations to "sink" (be pushed down) to the data servers. Built on the Akka distributed computing framework, it explores the suitable partition strategies for all partitioned tables and automatically shards the data across multiple servers.
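
As a rough illustration of the idea in item 2 (grouping tables by how they are accessed together in the SQL history), the Python sketch below may help. It is not the thesis's algorithm. A naive regular expression stands in for the ANTLR-based parser, the pairwise co-access count is an assumed feature, plain average-linkage agglomerative clustering stands in for the modified hierarchical clustering, and all function names and the sample history are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Naive stand-in for a real SQL parser (assumption): pick up identifiers that
# follow FROM, JOIN, INTO, or UPDATE. Real SQL needs a proper grammar.
TABLE_RE = re.compile(
    r"\b(?:from|join|into|update)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE
)


def tables_in(sql):
    """Return the set of table names referenced by one SQL statement."""
    return {name.lower() for name in TABLE_RE.findall(sql)}


def co_access_counts(history):
    """Count how often each pair of tables appears in the same statement."""
    counts = Counter()
    for sql in history:
        for pair in combinations(sorted(tables_in(sql)), 2):
            counts[pair] += 1
    return counts


def affinity(group_a, group_b, counts):
    """Average linkage: mean co-access count over all cross-group table pairs."""
    total = sum(counts[tuple(sorted((a, b)))] for a in group_a for b in group_b)
    return total / (len(group_a) * len(group_b))


def cluster_tables(history, num_servers):
    """Merge the two most co-accessed groups until num_servers groups remain."""
    all_tables = sorted(set().union(*(tables_in(sql) for sql in history)))
    counts = co_access_counts(history)
    groups = [frozenset([t]) for t in all_tables]
    while len(groups) > max(num_servers, 1):
        a, b = max(
            combinations(groups, 2),
            key=lambda pair: affinity(pair[0], pair[1], counts),
        )
        groups = [g for g in groups if g not in (a, b)] + [a | b]
    return groups


if __name__ == "__main__":
    # Hypothetical SQL history, for illustration only.
    history = [
        "SELECT * FROM orders o JOIN order_items i ON o.id = i.order_id",
        "SELECT * FROM orders o JOIN order_items i ON o.id = i.order_id WHERE o.id = 7",
        "SELECT * FROM users u JOIN profiles p ON u.id = p.user_id",
        "UPDATE users SET name = 'x' WHERE id = 1",
        "SELECT * FROM orders o JOIN users u ON o.user_id = u.id",
    ]
    for server, group in enumerate(cluster_tables(history, num_servers=2)):
        print(server, sorted(group))
```

Tables that are frequently joined end up in the same group, so each group can be placed on one server and most joins stay local. A production version would also need to weigh table sizes and server load, which this sketch ignores.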

Keywords/Search Tags:

Distributed Relational Database, SQL Parsing, Vertical Partition, Horizontal Partition

Related items

1	Research On Data Partition Optimization Method Of Shared-Nothing Relational In-Memory Database
2	Research And Implementation Of Data Placement And Query Techniques Based On MapReduce In Distributed Multi-Dimensional Data Warehouse
3	Research On Virtual Partition Strategies Of A Shared Storage Distributed Database
4	Design And Implementation Of Distributed Database Log Management Subsystem Based On Virtual Partition
5	Research On Efficient Outlier Detection Algorithms In DRDB
6	Dynamic Data Partition In Distributed Information Networking Database Management System
7	Research On Functional Dependencies Mining Algorithm Based On Attribute Partition Information Gain
8	A Distributed Vertical Frequent Pattern Mining Algorithm Based On Metadata Integration
9	Design And Implementation Of A Bank Data Center Based On Virtualization Technology
10	Research On Privacy Protection Mechanism Of Non-relational Database For SaaS