
Research On Dynamic Replica Management Strategy Of Cloud Storage Based On HDFS

Posted on: 2019-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ba
GTID: 2428330542494357
Subject: Software engineering
Abstract/Summary:
With the rapid development of science and technology, new technologies such as cloud computing, social networks, and the Internet of Things have brought great convenience to people's work and daily life. At the same time, the volume and variety of data have grown explosively. With the advent of the big data era, cloud storage systems are attracting more and more attention from users because of their powerful data management and storage capabilities. Replica technology is widely used to improve the reliability, scalability, and security of cloud storage systems.

HDFS (Hadoop Distributed File System), the distributed file system of Hadoop, has powerful data storage and management capabilities. Its replica management mechanism improves the availability of cloud storage data, and also improves the reliability, read efficiency, and load balancing of the cloud storage system. However, the static replica management mechanism adopted by HDFS still has some defects: (1) In a cloud storage system with high reliability requirements, storing a large number of replicas increases data storage and maintenance costs. (2) A cloud storage system consists of a large number of inexpensive nodes, so node failure is the norm. The HDFS replica management mechanism selects replica storage locations randomly and considers neither the load on the data nodes nor dynamic changes in data access volume, which harms the load balance of the cloud storage system.

To solve these problems, this paper proposes a dynamic replica management mechanism, DRMS (Dynamic Replica Management Scheme). The main contents of this paper are: (1) Based on the relationship between data availability and the number of replicas, this paper dynamically calculates and maintains the minimum number of replicas required to meet the availability requirement, which effectively saves storage space in the cloud storage system. (2) To improve system performance and balance the load, this paper uses a dynamic replica placement mechanism with three replica placement strategies that suit different stages and application scenarios. In the replica creation stage, a user-oriented replica placement strategy is adopted. In the replica running stage, a business-oriented replica placement strategy is used, which can satisfy most requesters and so keeps both the system and data access efficient. (3) The replica adjustment strategy saves storage space and reduces the maintenance cost of the system. This paper uses a gray prediction model to predict the future access heat of data blocks from recent access records and adjusts the number of replicas dynamically. If the access frequency of a data block increases, the number of replicas is increased; if it decreases, a least-recently-accessed policy is used to delete the redundant replicas, which saves storage space in the cloud storage system.
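Contributions (1) and (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and the abstract does not give its formulas, so several things here are assumptions: replicas are taken to fail independently, so a block with r replicas on nodes of availability p has availability 1 - (1 - p)^r; the gray prediction model is taken to be the standard GM(1,1) model; and the function names, the `accesses_per_replica` capacity parameter, and the rule mapping predicted access heat to a replica count are all hypothetical.

```python
import math


def min_replicas(node_availability, target_availability, max_replicas=32):
    """Smallest r with 1 - (1 - p)**r >= target (assumes independent failures)."""
    if not (0.0 < node_availability < 1.0 and 0.0 < target_availability < 1.0):
        raise ValueError("availabilities must lie strictly between 0 and 1")
    unavailable = 1.0
    for r in range(1, max_replicas + 1):
        unavailable *= 1.0 - node_availability  # probability all r replicas are down
        if 1.0 - unavailable >= target_availability:
            return r
    raise ValueError("target not reachable within max_replicas")


def gm11_forecast(history, steps=1):
    """Forecast the next `steps` values of a positive series with GM(1,1)."""
    n = len(history)
    if n < 4 or any(v <= 0 for v in history):
        raise ValueError("need at least 4 positive observations")
    # Accumulated series x1 and background values z (means of neighbours).
    x1, total = [], 0.0
    for v in history:
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = history[1:]
    # Least-squares fit of y[k] = -a * z[k] + b (2x2 normal equations).
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    if abs(a) < 1e-12:  # flat series: the model degenerates to a constant
        return [b] * steps
    c = history[0] - b / a

    def x1_hat(k):
        # Fitted accumulated series at time index k.
        return c * math.exp(-a * k) + b / a

    # Difference the accumulated forecast to recover the original scale.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


def target_replicas(history, availability_floor, accesses_per_replica):
    """Hypothetical rule: enough replicas to serve the predicted access count,
    but never fewer than the availability-driven minimum."""
    predicted = gm11_forecast(history)[0]
    return max(availability_floor, math.ceil(predicted / accesses_per_replica))
```

A caller might compute `floor = min_replicas(0.8, 0.99)` once per block and then call `target_replicas(recent_access_counts, floor, 100)` in each adjustment period. Replica placement and the choice of which replica to delete are left out of the sketch.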
Keywords/Search Tags:Cloud storage, replica technology, HDFS, distributed system, availability