
Research On The Technology Of Node Repair And Data Update In Distributed Storage System

Posted on: 2023-11-27 | Degree: Master | Type: Thesis
Country: China | Candidate: R Y Wei | Full Text: PDF
GTID: 2568306836463174 | Subject: Information and Communication Engineering
Abstract/Summary:
In the big data era, distributed storage systems are gradually replacing traditional storage systems because of their excellent storage performance and low construction cost. A distributed storage system generally spreads the stored data across different nodes and uses a redundancy strategy to keep the data safe. When data is corrupted or lost, a fault-tolerance mechanism recovers the invalid data; when data is updated, the update is forwarded to the relevant storage nodes according to the redundancy strategy. At present, the mainstream redundancy strategies are multiple copies (replication) and erasure codes. Compared with multi-copy redundancy, erasure code redundancy has attracted much attention because of its higher space utilization. However, erasure-coded storage is complex, so it incurs large computing and transmission overhead during node repair or data update. Although the computing overhead is falling rapidly as computer technology develops, the transmission overhead introduced by network transfer remains very large.

To address the high transmission overhead caused by erasure code redundancy, the current mainstream research direction is to build a better transmission topology for the repair process and for the update process respectively. This research still has several shortcomings. First, most studies ignore the impact of node performance on transmission efficiency and do not select better-performing nodes to take part in transmission and computation in the heterogeneous networks that distributed storage runs on. Second, most studies treat transmission delay and transmission traffic independently and do not consider the trade-off between them. Finally, most existing work solves the problem by designing a locally optimal transmission tree, whose performance does not reach the global optimum. Aiming at these problems, this thesis studies the fault repair process and the data update process under erasure code redundancy. The main work and innovations are as follows:

(1) For the single-node fault repair scenario with erasure code redundancy, a new single-node fault data reconstruction method is designed. In this method, the network link state is collected through software-defined networking (SDN), and the data recovery problem of a single faulty node is modeled as the problem of finding an optimal repair tree, with repair traffic and repair delay as the optimization objectives. A hybrid genetic algorithm with K-shortest-path preprocessing is designed to find an approximate global optimum and to balance the two objectives. The experimental results show that, compared with the traditional tree topology and star topology, the designed single-node repair method significantly improves repair efficiency.

(2) For the scenario of multi-node data update using the hybrid update mode under erasure code redundancy, a new multi-node data update method is designed. This method collects the network status and node performance information through SDN, establishes a multi-attribute decision-making model over the set of data update nodes, and determines the relay node through the ideal point method. Then, taking the relay node as the center, the multi-node update problem is formulated as an optimal update topology model with update traffic and update delay as the optimization objectives, and the genetic algorithm with K-shortest-path preprocessing designed in (1) is extended to solve it and obtain an approximate global optimum. The experimental results show that the update method designed in this thesis effectively reduces the update cost compared with the classical update scheme.
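As an illustration of the relay-node selection step in (2), the following is a minimal sketch of an ideal point ranking. It assumes the ideal point method resembles the common TOPSIS-style procedure (weighted, normalized attributes scored by closeness to the best achievable point). The attribute set (bandwidth, latency, CPU load), the weights, the node names, and the function `rank_by_ideal_point` are all hypothetical and are not taken from the thesis.

```python
import math


def rank_by_ideal_point(candidates, weights, benefit):
    """Rank candidates by closeness to the ideal point (TOPSIS-style sketch).

    candidates: dict mapping node name -> list of attribute values
    weights:    list of attribute weights (hypothetical values)
    benefit:    list of bools, True if larger is better for that attribute
    Returns (names sorted best-first, dict of closeness scores in [0, 1]).
    """
    names = list(candidates)
    n_attrs = len(weights)

    # Vector-normalize each attribute column, then apply its weight.
    norms = [
        math.sqrt(sum(candidates[n][j] ** 2 for n in names)) or 1.0
        for j in range(n_attrs)
    ]
    scaled = {
        n: [weights[j] * candidates[n][j] / norms[j] for j in range(n_attrs)]
        for n in names
    }

    # Ideal point: best value per attribute; anti-ideal point: worst value.
    ideal = [
        (max if benefit[j] else min)(scaled[n][j] for n in names)
        for j in range(n_attrs)
    ]
    anti = [
        (min if benefit[j] else max)(scaled[n][j] for n in names)
        for j in range(n_attrs)
    ]

    scores = {}
    for n in names:
        d_pos = math.dist(scaled[n], ideal)  # distance to the ideal point
        d_neg = math.dist(scaled[n], anti)   # distance to the anti-ideal point
        total = d_pos + d_neg
        scores[n] = d_neg / total if total else 0.0

    ranking = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranking, scores


# Hypothetical candidates: [bandwidth in Mbps, latency in ms, CPU load].
nodes = {
    "n1": [1000, 2, 0.2],
    "n2": [100, 10, 0.8],
    "n3": [500, 5, 0.5],
}
ranking, scores = rank_by_ideal_point(
    nodes, weights=[0.5, 0.3, 0.2], benefit=[True, False, False]
)
relay = ranking[0]  # the candidate closest to the ideal point
```

In this toy input, `n1` is best on every attribute, so it coincides with the ideal point and is chosen as the relay. With real measurements no single node usually dominates, and the weights decide how bandwidth, latency, and load are traded against each other.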
Keywords/Search Tags: Distributed storage system, Erasure code, Software-defined networking, Multi-attribute decision-making, Genetic algorithm