
Performance Optimization In Coding-based Heterogeneous Distributed Storage Systems

Posted on: 2019-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: P F Liu
Full Text: PDF
GTID: 2428330620962234
Subject: Information and Communication Engineering
Abstract/Summary:
In the era of big data, the development of network and communication technologies has intensified information exchange. Network applications and data services such as video streaming, social networks, online storage, and mobile payment are used by more and more people in daily life, which has led to explosive growth in global data. Large-scale distributed storage systems such as cloud storage offer an elegant way to store this volume of data, letting users access and share their data seamlessly. Distributed storage systems consist of many cheap storage devices that are individually unreliable, so node failures are the norm rather than the exception. Guaranteeing system reliability efficiently is therefore an urgent problem. To provide a reliable storage service, distributed storage systems usually introduce redundancy by coding.

Most existing work on distributed storage systems focuses on a homogeneous model. However, distributed storage systems are heterogeneous in nature. The heterogeneity has two main aspects: (1) different storage nodes have different availabilities, storage capacities, and storage costs; (2) the links between different storage nodes may have different bandwidths and transmission delays. How to improve storage efficiency and system reliability in such a heterogeneous environment is an open problem that deserves in-depth study. This thesis is devoted to modeling heterogeneous distributed storage systems, establishing an optimization framework for system performance, and establishing a fundamental tradeoff between system storage cost and system repair cost.

The main contributions of this thesis are as follows:
(1) A heterogeneous model with data repair and data reconstruction mechanisms is established for distributed storage systems, in which different storage nodes have different storage capacities, storage costs, and repair costs. The heterogeneous model is further abstracted as an information flow graph. A tight upper bound on the system storage capacity is obtained by analyzing the max-flow and min-cut of the corresponding information flow graph.
(2) The problem of establishing a tradeoff between system storage cost and system repair cost is formulated as a bi-objective optimization problem subject to the min-cut constraints of the information flow graphs. The weighted-sum method is used to find the optimal solution.
(3) Within the bi-objective optimization framework, the number of min-cut constraints is greatly reduced by analyzing the structural properties of the feasible objective space, which guarantees that the tradeoff between system storage cost and system repair cost can be established in polynomial time.

Finally, the accuracy of the evaluation results and the effectiveness of the optimization algorithm are quantified through simulation experiments.
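The weighted-sum method named in contribution (2) can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes the feasible (storage cost, repair cost) pairs have already been enumerated, it omits the min-cut constraints entirely, and the function name `weighted_sum_tradeoff` and the sample values are hypothetical. The idea is to sweep a weight w over [0, 1] and keep each minimizer of w * storage + (1 - w) * repair.

```python
from typing import List, Tuple

# (storage_cost, repair_cost); both objectives are minimized.
Point = Tuple[float, float]


def weighted_sum_tradeoff(points: List[Point], steps: int = 10) -> List[Point]:
    """Sweep w over [0, 1] and collect the minimizers of
    w * storage_cost + (1 - w) * repair_cost."""
    found: List[Point] = []
    for i in range(steps + 1):
        w = i / steps
        best = min(points, key=lambda p: w * p[0] + (1 - w) * p[1])
        if best not in found:
            found.append(best)
    return sorted(found)


# Hypothetical feasible cost pairs, made up for illustration only.
candidates = [(1.0, 9.0), (2.0, 5.0), (4.0, 3.0), (5.0, 6.0), (8.0, 1.0)]
tradeoff = weighted_sum_tradeoff(candidates)
print(tradeoff)
```

In this toy data the pair (5.0, 6.0) is never selected, because (2.0, 5.0) is better in both objectives. A general caveat of weighted-sum scalarization is that it only recovers tradeoff points on the convex hull of the feasible objective space.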
Keywords/Search Tags: Heterogeneous distributed storage systems, Data repair, Regenerating codes, Bi-objective optimization, Tradeoff curve