Research On Node Repair Technology Of Distributed Storage System Under The Background Of Multiple Data Centers

Posted on: 2022-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Suo
Full Text: PDF
GTID: 2518306554470924
Subject: Computer Science and Technology
Abstract/Summary:
The big data era poses new challenges to the performance of traditional storage systems, which have many shortcomings for large-scale data storage. Distributed storage systems, with their strong performance and low construction cost, have therefore become the mainstream choice for large-scale storage. However, because their underlying equipment is generally cheap commodity hardware with a high failure rate, node failure is a common event, and ensuring the integrity and reliability of stored data becomes the primary issue. To prevent business losses from data loss caused by node failures, distributed storage systems give up part of their storage space to hold redundant data. Common redundancy schemes fall into two classes: replication (multiple copies) and erasure codes. Replication is mostly used for hot data with high access frequency, while erasure codes are often used for cold data with low access frequency; most existing distributed storage systems support mixing the two. When nodes fail and data is lost, replication recovers the data by copying a nearby replica, whereas erasure codes must read other data blocks and decode them to reconstruct the lost data. This process generates a large amount of repair traffic, and repair is slow. To address these problems while ensuring data security and reliability, the main work and innovations of this thesis are as follows:

(1) A cross-data-center data distribution strategy is introduced. The strategy constrains how data is distributed among multiple data centers: each group of erasure-coded blocks is spread over several data centers, and the number of coded blocks stored in any one data center is smaller than the number required for recovery. A storage system deployed with this strategy can tolerate failures at the data center level, thereby ensuring high data reliability.

(2) For single-node failure repair across data centers with RS codes, the characteristics of tree-based and pipelined repair are combined so that every node in the repair tree merges the data it forwards, and the heterogeneity of available link bandwidth and node processing capability across data centers is taken into account to optimize the node repair delay. The topology is transformed into a constrained repair model, and a corresponding genetic algorithm is designed to obtain an approximate global optimum. Simulation results show that, compared with the traditional pipelined repair scheme and the greedy tree algorithm, the repair delay of the reconstructed tree is effectively reduced.

(3) For single-node failure repair across data centers with minimum storage regenerating (MSR) codes, the single-failure repair topology proposed in (2) is extended, and auxiliary computing nodes are introduced to design a new single-node failure repair topology. Subject to the constraints of the cross-data-center distribution strategy and the node constraints of the MSR code itself, and with the goals of minimum repair delay and minimum additional transmission overhead, a corresponding genetic algorithm is designed to solve the problem, achieving a trade-off between repair delay and extra transmission overhead. Simulation experiments show that, at the same storage scale, the proposed scheme achieves a lower repair delay than the traditional tree-based scheme and also improves on the three-layer repair tree that introduces auxiliary nodes.
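The placement constraint in contribution (1) can be illustrated with a small check. This is only a sketch under stated assumptions, not the thesis's actual strategy: it assumes an MDS (n, k) code such as RS, where any k of the n blocks suffice for recovery, so losing a data center that holds m blocks is survivable only if n - m >= k. The function names `tolerates_datacenter_failure` and `round_robin_placement`, and the RS(9, 6) parameters, are hypothetical and chosen for illustration.

```python
from collections import Counter


def tolerates_datacenter_failure(placement, n, k):
    """Check whether an (n, k) MDS stripe survives the loss of any one data center.

    placement: list of data-center ids, one entry per block of the stripe
               (hypothetical representation, not taken from the thesis).
    """
    if len(placement) != n:
        raise ValueError("placement must list exactly n blocks")
    blocks_per_dc = Counter(placement)
    # Losing a data center with m blocks leaves n - m survivors;
    # an MDS code needs at least k survivors to rebuild the stripe.
    return all(n - m >= k for m in blocks_per_dc.values())


def round_robin_placement(n, num_dcs):
    """Spread n blocks as evenly as possible over num_dcs data centers."""
    return [i % num_dcs for i in range(n)]


if __name__ == "__main__":
    # Assumed example: RS(9, 6) over 3 data centers, 3 blocks each.
    balanced = round_robin_placement(9, 3)
    print(tolerates_datacenter_failure(balanced, n=9, k=6))

    # A skewed layout (4 + 3 + 2) loses too much if the first data center fails.
    skewed = [0] * 4 + [1] * 3 + [2] * 2
    print(tolerates_datacenter_failure(skewed, n=9, k=6))
```

Under this reading, the balanced layout passes the check and the skewed one does not, because the data center holding four blocks would leave only five survivors where six are needed.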
Keywords/Search Tags:Distributed storage system, single node repair, RS code, MSR code, genetic algorithm