Study On Data Storage In Distributed Storage Systems

Posted on: 2021-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: C L Yu
GTID: 2518306470480544
Subject: Information and Communication Engineering
Abstract/Summary:
The advent of Internet technology has driven progress across science and technology, especially the rapid development of information technology. How to store the large volume of data generated every day efficiently and reliably is therefore a focus of current research. Traditional data storage is generally centralized, but it requires expensive equipment and is hard to expand, so it is gradually being replaced by distributed storage systems. Distributed systems offer low cost and good scalability when storing data, and they can support very large distributed service systems. Replication and erasure coding are the strategies usually applied in distributed storage systems to ensure data reliability and validity. Replication stores copies of the original file and protects the data through those copies. It is simple and easy to implement, but its storage overhead is relatively large. Erasure coding protects the data by encoding the original file blocks to generate redundant check blocks, which effectively reduces the storage overhead of the system. However, repair requires decoding operations, so its computational complexity is higher. Fractional Repetition (FR) codes have been widely studied because of their low repair locality and low bandwidth cost when repairing faulty nodes, and because they can repair faulty nodes quickly without any coding operations. This thesis focuses on FR codes in distributed storage systems. The main research results are as follows:

(1) Because the construction of some existing FR codes is complex and constrained by their parameters, a construction algorithm for FR codes based on graph factorization (FRGF codes) is proposed. It allows the construction parameters and the repetition degree of the data blocks to be chosen within a wide range, and the construction methods are diverse. Specifically, a graph is first factorized, and the resulting factors are then used to construct the FR code. Different FR codes are obtained from the 1-factorization and the 2-factorization of the graph. Experimental results show that FRGF codes have lower repair locality, repair bandwidth overhead, and repair complexity, together with high repair efficiency, which significantly reduces the repair time of failed nodes.

(2) Considering the differences in how frequently data is accessed in distributed storage systems, a Heterogeneous Variable Fractional Repetition (HVFR) code based on Huffman-tree variable repetition is proposed. Specifically, data blocks with different access frequencies are used as weighted leaf nodes to build a Huffman tree, which determines the repetition degree of each data block. A pairwise balanced design (PBD) is then used to construct heterogeneous FR codes, which improves the parallel access speed of hot data and the storage efficiency of the system. Experimental results show that, compared with Reed-Solomon (RS) codes and simple regenerating codes, HVFR codes significantly reduce the repair time and repair locality of faulty nodes, improve the parallel access speed of hot data, and have lower computational complexity and a simpler construction.
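To make the idea in contribution (1) concrete, the following is a minimal Python sketch of one way an FR code can be derived from a graph 1-factorization. It is not the thesis's FRGF algorithm, whose details the abstract does not give. It assumes the complete graph K_n (n even), treats vertices as data blocks, and treats every edge of the first `rho` 1-factors as a storage node holding its two endpoint blocks. The function names `one_factorization`, `build_fr_code`, and `repair_plan` are hypothetical.

```python
def one_factorization(n):
    """1-factorization of the complete graph K_n (n even) by the circle method.

    Returns n-1 perfect matchings; each is a list of n/2 edges (u, v), u < v.
    """
    if n < 2 or n % 2:
        raise ValueError("n must be an even integer >= 2")
    m = n - 1  # vertices 0..m-1 sit on a circle, vertex m is the centre
    factors = []
    for r in range(m):
        matching = [(r, m)]  # the centre is paired with vertex r
        for k in range(1, n // 2):
            u, v = (r + k) % m, (r - k) % m
            matching.append((min(u, v), max(u, v)))
        factors.append(matching)
    return factors


def build_fr_code(n, rho):
    """Assumed construction: blocks are vertices, storage nodes are edges.

    Using the first rho 1-factors gives every block repetition degree rho.
    """
    if not 1 <= rho <= n - 1:
        raise ValueError("rho must satisfy 1 <= rho <= n - 1")
    chosen = one_factorization(n)[:rho]
    return [frozenset(edge) for matching in chosen for edge in matching]


def repair_plan(nodes, failed):
    """Map each block of the failed node to a surviving node holding a copy.

    Repair is a plain copy (no decoding), which is the defining FR property.
    """
    plan = {}
    for block in sorted(nodes[failed]):
        helpers = [i for i, node in enumerate(nodes)
                   if i != failed and block in node]
        if not helpers:
            raise RuntimeError(f"no surviving replica of block {block}")
        plan[block] = helpers[0]
    return plan


if __name__ == "__main__":
    nodes = build_fr_code(n=6, rho=3)  # 6 blocks, each stored 3 times
    print([sorted(node) for node in nodes])
    print(repair_plan(nodes, failed=0))
```

In this sketch the repetition degree `rho` can be any value from 1 to n-1, which illustrates how a factorization leaves the repetition degree open to choice. With `rho = 1` no replica exists, so `repair_plan` raises an error.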
Keywords/Search Tags:Distributed Storage, Fractional Repetition Codes, Factorization, Huffman Tree