Research On Key Technologies Of Distributed Storage In Cloud Computing

Posted on: 2018-08-21  Degree: Doctor  Type: Dissertation
Country: China  Candidate: J Li  Full Text: PDF
GTID: 1318330542977573  Subject: Computer system architecture
Abstract/Summary:
With the development of cloud computing and the emergence of big data applications, the amount of data is increasing rapidly. The need to store massive data brings new challenges to the construction of distributed storage systems. To guarantee high data reliability and availability while reducing data access overhead and maintenance cost, this dissertation investigates four aspects in depth: a data fault-tolerance method based on erasure codes, a data availability optimization method for deduplication, a replica placement algorithm, and a replica consistency maintenance strategy. The main work and contributions are as follows.

1. This dissertation proposes a method to speed up disk failure recovery. Storage systems often apply erasure codes to protect against disk failures. As a fault-tolerance technology, erasure codes guarantee data reliability while reducing storage overhead, so they are widely adopted in large-scale distributed storage systems. The Liberation code is one such coding scheme; its encoding and data modification operations are efficient, and it is used in many storage systems. However, the Liberation code does not effectively address fast recovery from a single disk failure. The proposed scheme jointly considers the two types of parity sets, the P parity set and the Q parity set, and uses both to recover the lost data. The results show that the proposed scheme reduces the amount of data read from disks and shortens the recovery time.

2. This dissertation proposes a data availability optimization method for deduplication. When a storage system adopts deduplication, data unavailability becomes a problem. To address it, erasure codes and replication can be used to add redundant information. This dissertation considers replication and analyzes the impact of access frequency and reference count on the importance of a data chunk. Based on the importance of a data chunk, it proposes an availability probability function to determine the degree of redundancy. By constructing a Lagrange function from the data availability cost function and the storage overhead, the optimal degree of redundancy is obtained. The results show that the proposed method improves data availability while minimizing storage overhead.

3. This dissertation proposes a replica placement algorithm based on discrete glowworm swarm optimization. When replication is adopted in a storage system, the replica placement algorithm affects not only the access efficiency of users but also the overall overhead. To address the high overhead of accessing replicas, a replica placement algorithm based on discrete glowworm swarm optimization is proposed. By establishing a mathematical model of user access overhead, the algorithm computes the fitness function of each glowworm's position, updates the individual positions, and then obtains suitable nodes for replica placement. The results show that the proposed algorithm converges better and reduces both data access overhead and response time.

4. This dissertation proposes a tree-based consistency maintenance strategy for replicas. Replication can be used in storage systems to guarantee reliability and availability, improve data access efficiency, and balance system load. However, it introduces the problem of replica inconsistency. To solve this problem, this dissertation proposes a replica consistency maintenance strategy. Inspired by Bluetooth networks, the concepts of Piconet and master-slave nodes are introduced to partition the network topology at a finer granularity. A consistency tree composed of master nodes is then constructed, and push and adaptive pull methods are introduced to transmit update messages. The results show that the proposed strategy maintains replica consistency and significantly reduces redundant messages.

In conclusion, this dissertation studies in depth the key technologies of distributed storage in the cloud computing environment and contributes to improving the reliability, availability, and performance of distributed storage systems.
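The link between chunk importance and degree of redundancy in the second contribution can be illustrated with a minimal sketch. It assumes replicas fail independently and that each is reachable with probability p, so a chunk with r replicas is available with probability 1 - (1 - p)^r. The function names, the linear mapping from importance to a target availability, and the default bounds are hypothetical choices made for illustration; they are not the dissertation's availability probability function or its Lagrange formulation.

```python
def availability(p: float, r: int) -> float:
    """Probability that at least one of r independent replicas is reachable."""
    return 1.0 - (1.0 - p) ** r


def target_for(importance: float, low: float = 0.99, high: float = 0.9999) -> float:
    """Hypothetical mapping: importance in [0, 1] -> required availability.

    In the abstract, importance depends on access frequency and reference
    count; here it is simply taken as an already-normalized input.
    """
    return low + (high - low) * importance


def min_replicas(p: float, target: float) -> int:
    """Smallest replica count r whose availability meets the target.

    Assumes 0 < p < 1 and 0 < target < 1, so the loop terminates.
    """
    r = 1
    while availability(p, r) < target:
        r += 1
    return r


if __name__ == "__main__":
    # Compare an unimportant chunk with a heavily referenced one.
    for importance in (0.0, 0.5, 1.0):
        t = target_for(importance)
        print(importance, t, min_replicas(0.95, t))
```

Under these assumptions, more important chunks receive more replicas, and the smallest replica count that meets the target keeps storage overhead low. A closed-form optimum such as the one obtained from a Lagrange function would replace the simple search in `min_replicas`.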
Keywords/Search Tags: Cloud Computing, Distributed Storage, Erasure Codes, Data Deduplication, Replica Placement, Replica Consistency