
Design And Implementation Of Distributed Storage Cluster System

Posted on: 2020-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: H Xu
Full Text: PDF
GTID: 2428330605475833
Subject: Computer technology
Abstract/Summary:
Big data application scenarios are gradually moving toward a data center architecture, characterized by computing at the edge and centralized data storage. Its advantages are data sharing, a unified storage format, and lower storage conversion and transmission costs. As the underlying storage center, a distributed storage system must cope with rapid data growth and diverse data formats. Three major problems currently need to be solved in the design of distributed storage systems: data consistency, data reliability, and disk load balancing. The distribution of data across disks directly affects the load balance of the cluster being accessed, so the quality of a distributed storage system's load balancing algorithm is one of the criteria for judging the system.

This paper analyses existing distributed storage architectures, focusing on the working principles and mechanisms of GFS, Ceph, and BlueStore and on their advantages and disadvantages, and designs a Crush routing table based on the Paxos algorithm. The Crush routing table is a quasi-random routing table that distributes files across heterogeneous, structured clusters using a min-heap algorithm. The min-heap is used to select the least-used disk, and the copies of a file are stored on different computers to ensure the reliability of multi-copy storage. The design goal of the routing table is to optimize data distribution and make full use of the system's resources. When disk devices are added or removed, data can be reorganized effectively, and flexible constraints can be placed on the location of data replicas, so that data reliability is guaranteed when some disks fail. The Paxos algorithm is introduced to keep the Crush routing table and its state consistent whenever the table changes.

Building on the Crush routing table, this paper designs and implements an efficient distributed storage system, with a focus on data consistency, data reliability, and load balancing. For intra-cluster communication the system adopts the reactor communication mode; for routing it uses the Crush routing table; it introduces the Paxos algorithm to ensure consistency; and it applies an EC compression algorithm to cold data to ensure data reliability. The resulting system offers high reliability, high security, strong consistency, strong disaster tolerance, and low cost. In practical use, the system also implements slow-disk avoidance, memory pool optimization, and read-ahead optimization according to the characteristics of the business data. The performance and stability of the system were tested under sustained load, cluster expansion, and downtime scenarios. The test results show that the system runs smoothly in all three scenarios, which demonstrates that the Crush routing table based on the Paxos algorithm performs well.
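The abstract does not give the thesis's actual placement code, so the following is only a minimal sketch of the idea it describes: use a min-heap to pick the least-used disks while keeping each copy on a different computer. The `Disk` type, its fields (`used_ratio`, `disk_id`, `host`), and the function `place_replicas` are hypothetical names introduced for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Disk:
    # Hypothetical record: disks are ordered by usage only.
    used_ratio: float
    disk_id: str = field(compare=False)
    host: str = field(compare=False)


def place_replicas(disks, replicas):
    """Pick `replicas` disks with the lowest usage, each on a different host."""
    heap = list(disks)
    heapq.heapify(heap)  # min-heap keyed on used_ratio
    chosen, seen_hosts = [], set()
    while heap and len(chosen) < replicas:
        disk = heapq.heappop(heap)  # least-used remaining disk
        if disk.host in seen_hosts:
            continue  # a copy already lives on this host; skip it
        chosen.append(disk)
        seen_hosts.add(disk.host)
    if len(chosen) < replicas:
        raise ValueError("not enough distinct hosts for the requested replica count")
    return chosen


disks = [
    Disk(0.70, "d1", "hostA"),
    Disk(0.20, "d2", "hostA"),
    Disk(0.25, "d3", "hostA"),
    Disk(0.40, "d4", "hostB"),
    Disk(0.55, "d5", "hostC"),
]
picked = place_replicas(disks, 3)
print([d.disk_id for d in picked])  # ['d2', 'd4', 'd5']
```

In this sketch, `d3` is the second least-used disk but is skipped because it shares a host with `d2`, which is the trade-off between balance and replica isolation that the abstract describes. A real routing table would also need to handle weights, failure domains beyond the host level, and table changes agreed through Paxos, none of which is modelled here.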
Keywords/Search Tags: Distributed storage system, Crush routing table, Crush algorithm, Paxos algorithm