
Research And Implementation Of Heterogeneous Storage In Distributed System Based On CRUSH Algorithm

Posted on: 2022-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: J X Miao
GTID: 2518306773496484
Subject: Automation Technology
Abstract/Summary:
With the promotion of new infrastructure in China, enterprise data storage requirements have become increasingly diverse and complex, and distributed storage is gradually developing from small-scale to large-scale data cluster mode. To meet these challenges, heterogeneous media storage has become a new trend in the development of distributed storage. Based on current distributed network file system storage technology and the trend toward large data clusters, this thesis studies and implements a distributed network file system based on the CRUSH (Controlled Replication Under Scalable Hashing) algorithm and breaks through the performance bottleneck through a heterogeneous storage transformation.

This thesis proposes a hierarchical improvement design for the storage system based on the unified storage architecture, which divides the distributed network file system into a computing layer and a unified storage layer. It focuses on the research and design of the computing layer, which is mainly responsible for metadata management, user data indexing, the storage hierarchy and so on, and is implemented as Proxy, Segment Server and Master Server modules. The thesis chooses the CRUSH algorithm as the data distribution algorithm across different disk media, optimizes the CRUSH algorithm for heterogeneous media storage, and proposes the "brother rule" concept to minimize the amount of data moved during data migration. It also presents a traffic control optimization algorithm based on the CRUSH algorithm, which limits the bandwidth used by data migration traffic and so reduces the impact of migrated data on user traffic. The performance degradation caused by disk failure is addressed by a blocklist mechanism based on the CRUSH algorithm.

In the layered data design, the computing layer achieves media heterogeneity by controlling which media store each data layer. Data is divided into level 0, level 1 and level 2. Level 0, the layer where new data is written and read, uses efficient SSDs to convert the file system's random writes into sequential writes and so improve the write speed of user data. Level 1 and level 2 use regular HDDs to cope with large-scale data storage. Data in level 0 is migrated to level 1 and level 2 by Dump after a certain period of time, which saves cost without affecting read and write performance.

This thesis designs and completes the heterogeneous media storage scheme on the Rhea-FS distributed network file system, describes the design and implementation of the Rhea-FS computing layer in detail, and compares the performance of Rhea-FS and Ceph FS. Rhea-FS is found to perform similarly to Ceph FS under SSD conditions and better under 4K conditions. The main advantage of Rhea-FS is that it builds a layered hybrid storage system, which improves the read-write performance of the system while controlling cost. Tests show that a cluster with an SSD-to-HDD ratio of 1:10 can achieve write performance, and read performance for hot data, close to that of SSD. In addition, Rhea-FS can be customized more flexibly according to business characteristics, including flow control and data temperature control. During data migration, Rhea-FS performs better thanks to flow control. Rhea-FS's blocklist mechanism can also effectively reduce the impact of cluster failures and improve overall performance. At present, Rhea-FS has been put into use within the enterprise and has gradually replaced the company's Ceph FS products.
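
The abstract does not give the placement algorithm in detail, so the following is only a minimal sketch of the general idea, under stated assumptions: weighted, hash-based device selection in the style of CRUSH's straw2 bucket (equivalent to weighted rendezvous hashing), restricted by data level to one media type, with a blocklist that excludes failed disks. The names `LEVEL_MEDIA`, `unit_hash` and `place`, the device table and the weights are all hypothetical; this is not the Rhea-FS or Ceph implementation, and it does not model the "brother rule" or flow control.

```python
import hashlib
import math

# Assumption taken from the abstract: level 0 lives on SSD,
# levels 1 and 2 live on HDD. The mapping itself is hypothetical.
LEVEL_MEDIA = {0: "ssd", 1: "hdd", 2: "hdd"}


def unit_hash(key, device):
    """Map (key, device) to a deterministic float in (0, 1]."""
    digest = hashlib.sha256(f"{key}|{device}".encode()).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / 2**64


def place(key, level, devices, replicas, blocklist=frozenset()):
    """Choose `replicas` devices for `key` on the media used by `level`.

    devices:   dict of name -> (media kind, positive weight)
    blocklist: names of devices to skip, e.g. disks marked as failed
    """
    media = LEVEL_MEDIA[level]
    candidates = {
        name: weight
        for name, (kind, weight) in devices.items()
        if kind == media and weight > 0 and name not in blocklist
    }
    if len(candidates) < replicas:
        raise ValueError("not enough healthy devices for this level")
    # straw2-style draw: ln(u) / weight, largest draw wins.
    # Each device's draw depends only on (key, device), so blocking one
    # device never changes the relative order of the others.
    ranked = sorted(
        candidates,
        key=lambda name: math.log(unit_hash(key, name)) / candidates[name],
        reverse=True,
    )
    return ranked[:replicas]


if __name__ == "__main__":
    # Hypothetical cluster: one SSD for every five HDDs, equal weights.
    cluster = {"ssd-0": ("ssd", 1.0), "ssd-1": ("ssd", 1.0)}
    cluster.update({f"hdd-{i}": ("hdd", 1.0) for i in range(10)})
    print(place("object-42", 0, cluster, replicas=1))
    print(place("object-42", 1, cluster, replicas=3))
    print(place("object-42", 1, cluster, replicas=3, blocklist={"hdd-3"}))
```

Because every draw is independent of the other devices, adding a disk to the blocklist only changes the placement of keys that were actually stored on that disk. This is the property that keeps migration volume small in hash-based placement schemes of this kind.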
Keywords/Search Tags: Distributed Storage, Data Center, Heterogeneous Media, CRUSH Algorithm