Research On Optimization Design Of Regenerating Code Based On Heterogeneous Distributed Storage

Posted on: 2019-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Ai
Full Text: PDF
GTID: 2428330542499998
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of science and technology, the volume of data, as the carrier of information, continues to grow, so large storage systems must be built to meet the rising demand for bandwidth and storage space. As system scale increases, fault tolerance and reliability become increasingly prominent concerns. Massive data imposes stringent requirements on the storage system: larger capacity, stronger security, better storage performance, and lower cost. Large-scale distributed storage systems are widely deployed because they offer massive storage capacity, high throughput, high availability, high scalability, and low cost. Regenerating codes, which build on network coding theory, were proposed as a major coding technique for distributed storage systems because they can effectively reduce repair bandwidth.
Existing regenerating codes for distributed storage systems have several shortcomings. First, most are designed for homogeneous systems, i.e., the same amount of data is downloaded from each surviving node, at the same download cost. In practice, the amount of data downloaded from each surviving node often differs, as does the download cost. Second, traditional regenerating codes often incur excessive disk I/O during repair (disk I/O here denotes the total number of disk reads during the repair process). Disk I/O is a precious resource in distributed storage systems, and disk access is usually a bottleneck in disk array systems, so disk I/O should be kept low. Finally, distributed storage systems often have high security requirements, so coding methods must be designed to meet them.
To address these issues, this thesis proposes a coding scheme for the heterogeneous case (that is, the amounts of data downloaded from the surviving nodes during repair differ, as do their download costs) that combines replication with regenerating codes. The thesis calls this scheme heterogeneous replication regenerating codes (HRRC). The main work and innovations are summarized as follows.
First, traditional regenerating codes often incur high disk I/O, and regenerating codes for heterogeneous distributed storage systems have a higher repair bandwidth than those for homogeneous systems. HRRC is proposed to address both problems. The research establishes the system model and its information flow graph. Based on the max-flow min-cut theorem, the basic conditions for constructing a regenerating code are derived, and the trade-off between per-node storage capacity and repair bandwidth under this system model is obtained. The concept of download cost is introduced, and theoretical analysis shows that the download cost of the proposed method is lower than that of the previous approach. Simulation and numerical analysis of the relevant parameters, including disk I/O and node repair bandwidth, show that the proposed scheme can effectively reduce both.
Second, because distributed storage systems often have high security requirements, this thesis studies a wiretap security model and the security of HRRC in distributed storage systems. It considers the scenario in which a single node in the local data center fails, data in the remote data center assists the repair, and an eavesdropper observes the data transferred during the repair process. The information flow graph for this scenario is obtained and, based on the max-flow min-cut theorem, the basic conditions for constructing secure regenerating codes are derived. From the system model, the thesis derives the trade-off between the storage capacity and the repair bandwidth of a single node, as well as the relationship between node storage capacity and security level. It also analyzes, using the information flow graph and related simulations, how to give the distributed storage system a better security level. The simulation results show that the proposed scheme can improve the security of the system, and that increasing the storage cost can improve it further.
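The abstract does not state the bound it derives for the heterogeneous case. As background only, the sketch below evaluates the well-known cut-set bound for homogeneous regenerating codes, in which a file of size M is recoverable from any k nodes only if the sum over i = 0..k-1 of min(alpha, (d - i) * beta) is at least M. Here alpha is the per-node storage, beta is the amount downloaded from each of d helper nodes during repair, and gamma = d * beta is the repair bandwidth. The function names are illustrative assumptions, and this is not the HRRC bound from the thesis.

```python
from fractions import Fraction


def min_cut_capacity(k, d, alpha, beta):
    """Cut-set value for the homogeneous model: sum of min(alpha, (d - i) * beta), i = 0..k-1."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def msr_point(M, k, d):
    """Minimum-storage regenerating point: smallest alpha, then smallest beta."""
    alpha = Fraction(M, k)
    beta = Fraction(M, k * (d - k + 1))
    return alpha, beta


def mbr_point(M, k, d):
    """Minimum-bandwidth regenerating point: smallest repair bandwidth, with alpha = d * beta."""
    beta = Fraction(2 * M, k * (2 * d - k + 1))
    alpha = d * beta
    return alpha, beta


def repair_bandwidth(d, beta):
    """Total data downloaded to repair one failed node (gamma = d * beta)."""
    return d * beta
```

For example, with M = 12, k = 2, and d = 3, the minimum-storage point gives alpha = 6 and gamma = 9, while the minimum-bandwidth point gives alpha = gamma = 36/5. Both points meet the bound with equality, which illustrates the storage versus repair-bandwidth trade-off the abstract refers to.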
Keywords/Search Tags: Heterogeneous, DSS, Regenerating codes, Data security