
Research on Data Consistency and Load Balancing Optimization of Distributed Cluster Systems

Posted on: 2020-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zhou
GTID: 2518306110457604
Subject: Computer technology
Abstract/Summary:
Data consistency, data reliability, and load balancing are three major problems that distributed storage systems need to solve. Data distribution directly affects how well a cluster balances load, so the design of the load balancing algorithm in a distributed storage system is particularly important. The difficulties appear in the following aspects: (1) Common load balancing algorithms cannot guarantee a uniform distribution of files across disks; when the cluster is expanded in particular, data flows between new and old nodes easily become uneven. (2) The consistency of cluster data and cluster state cannot be guaranteed, and inconsistencies are likely to occur when routing tables or cluster states are switched. (3) Traditional open source distributed storage systems are hard to optimize for the characteristics of an existing business workload because of their large code bases and system complexity.

This thesis analyzes existing distributed storage architectures, focusing on three systems (GFS, Ceph, and BlueStore), and studies the working principles and mechanisms of distributed storage systems in order to address data consistency, reliability, and load balancing. The specific work is as follows: (1) The advantages and disadvantages of GFS, Ceph, and BlueStore are analyzed, and a distributed storage system is designed around a CRUSH routing table that is kept consistent with the Paxos algorithm. The CRUSH routing table is a fully randomized routing table that distributes object replicas reliably, efficiently, and evenly across heterogeneous, structured clusters. The table is computed with a min-heap algorithm. Its design goal is to optimize data distribution, make full use of the storage system's resources, and keep data well organized when devices are added or removed. It also protects data reliability when data-bearing hardware fails. (2) The CRUSH routing table is stored consistently by means of the Paxos algorithm, which ensures consistency in data storage; this is a concrete application of the CRUSH routing table. (3) An efficient distributed storage system is designed. The system adopts the director communication mode and integrates well with advanced communication and storage technologies such as RDMA and SPDK. An erasure coding (EC) algorithm is introduced to ensure data reliability. For practical use, the system also adds optimizations driven by the characteristics of the business data, such as slow-disk avoidance, memory pool optimization, and read-ahead optimization.

The performance and stability of the system are tested mainly in three scenarios: sustained load, capacity expansion, and node downtime. The test results show that the system is stable in all three scenarios, which indicates that the CRUSH routing table based on the Paxos algorithm performs well.
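The abstract states only that the routing table is computed with a min-heap algorithm and does not give the procedure. As a rough illustration of how a min-heap can produce an evenly balanced table, the sketch below assigns each placement group to the least-loaded nodes relative to their capacity. The function name `build_routing_table`, its parameters, and the load metric (assignments divided by capacity) are assumptions made for this example; this is not the thesis's algorithm and not Ceph's CRUSH implementation.

```python
import heapq


def build_routing_table(capacities, num_groups, replicas=2):
    """Hypothetical sketch: map each placement group to `replicas` distinct nodes.

    capacities: dict of node name -> relative capacity (assumed positive).
    Returns (table, counts), where table[i] lists the nodes for group i
    and counts[node] is the number of groups placed on that node.
    """
    if replicas > len(capacities):
        raise ValueError("need at least as many nodes as replicas")

    counts = {node: 0 for node in capacities}
    # Heap entries are (relative load, node); the least-loaded node is on top.
    heap = [(0.0, node) for node in sorted(capacities)]
    heapq.heapify(heap)

    table = []
    for _ in range(num_groups):
        # Popping `replicas` entries yields distinct nodes, because each node
        # has exactly one entry in the heap at any time.
        picked = [heapq.heappop(heap)[1] for _ in range(replicas)]
        table.append(picked)
        for node in picked:
            counts[node] += 1
            heapq.heappush(heap, (counts[node] / capacities[node], node))
    return table, counts


if __name__ == "__main__":
    table, counts = build_routing_table({"a": 1, "b": 1, "c": 1, "d": 1}, 6)
    for group_id, nodes in enumerate(table):
        print(group_id, nodes)
    print(counts)
```

Under this assumed scheme, a newly added node starts with zero load and is therefore chosen first when the table is recomputed, which is one simple way to reason about the expansion scenario the abstract describes.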
Keywords/Search Tags: Distributed storage system, Distributed storage system performance optimization, CRUSH algorithm, Paxos algorithm