
Study On The Distribution And Migration Mechanism Of Data In Cloud Storage

Posted on: 2017-04-07    Degree: Master    Type: Thesis
Country: China    Candidate: Q Wu    Full Text: PDF
GTID: 2308330503979539    Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the cloud era, big data has attracted increasing attention from both industry and academia. Data now grows at an exponential rate, and in order to guarantee the quality of service a data center offers to external users and to distribute file storage more effectively, most enterprises adopt some form of cloud computing (or cloud model) to access and store resources. Dynamic data migration moves heavily accessed data to other servers, so the storage location of data becomes highly flexible. Cloud providers therefore use dynamic data migration to pursue several goals at once: maximizing revenue, reducing operational costs, achieving green IT, and meeting the service demands of users in different geographic locations. Consequently, much recent research focuses on how to design or improve dynamic data migration algorithms.

Against the background of big-data analytics and the spread of unstructured data, Hadoop has received unprecedented attention. Its distributed file system, HDFS (Hadoop Distributed File System), is a distributed file storage system that supports operations on files such as create, delete, move, and rename. The HDFS architecture is built on a specific set of node roles: a NameNode (only one) and DataNodes that provide the storage blocks. Although each DataNode periodically reports all of its block information to the NameNode, the NameNode does not know which data is stored under other NameNodes.

This paper therefore further studies the relationship between the HDFS architecture and its nodes. First, it proposes that NameNodes create a mapping table and write the specific details of each data migration into it, so that data remains consistent between nodes during dynamic migration; this realizes an efficient, real-time distributed data migration mechanism (a sketch of such a mapping table is given below). Second, "cold" data is deleted in a principled way using a Bayesian algorithm. Traditional applications of Bayes' theorem condition on a single observed event and then compute a probability; in this algorithm, the evidence instead comes from analyzing the data itself. The amount of access to the data and the number of copies of the data held by surrounding hosts serve as the two conditions for deletion, so the Bayes formula is applied in an extended, more scientific and accurate form. The resulting probabilities are ranked from high to low, and the "cold" data with the highest probability is deleted (an illustrative scoring sketch also follows below).
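The abstract does not give the exact schema of the proposed mapping table, so the following Python sketch only illustrates the idea under assumed names and fields: each entry records which block is moving, from where to where, and whether the move has completed, so that reads can be resolved consistently while a migration is in flight.

import time
from dataclasses import dataclass, field

@dataclass
class MigrationRecord:
    # Hypothetical fields; the thesis does not specify the table layout.
    block_id: str
    source_node: str
    target_node: str
    started_at: float = field(default_factory=time.time)
    completed: bool = False

class MappingTable:
    """In-memory stand-in for the NameNode-side mapping table (assumed structure)."""

    def __init__(self) -> None:
        self._records: dict[str, MigrationRecord] = {}

    def begin_migration(self, block_id: str, source: str, target: str) -> None:
        # Record the migration before any data is moved.
        self._records[block_id] = MigrationRecord(block_id, source, target)

    def complete_migration(self, block_id: str) -> None:
        # Mark the move as finished once the target holds a consistent copy.
        self._records[block_id].completed = True

    def locate(self, block_id: str) -> str:
        """Return the node that currently serves the block: the target after the
        move finishes, otherwise the source, so readers see consistent data."""
        record = self._records[block_id]
        return record.target_node if record.completed else record.source_node

This is only a sketch of the bookkeeping; in the system described by the abstract, the table would additionally be shared between NameNodes so that migration information is visible across nodes in real time.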
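The thesis does not state its exact priors and likelihoods, so the sketch below is only a hypothetical naive-Bayes-style scoring that mirrors the two deletion conditions named in the abstract: the amount of access to the data and the number of copies held by surrounding hosts. All numeric values and names are placeholders.

from dataclasses import dataclass

@dataclass
class BlockStats:
    block_id: str
    access_count: int     # reads observed in the measurement window
    nearby_replicas: int  # copies of the same block on surrounding hosts

def cold_probability(stats: BlockStats,
                     prior_cold: float = 0.3,
                     access_threshold: int = 5,
                     replica_threshold: int = 2) -> float:
    """P(cold | low access, enough nearby replicas) under assumed likelihoods."""
    low_access = stats.access_count < access_threshold
    redundant = stats.nearby_replicas >= replica_threshold

    # Hypothetical likelihoods P(evidence | cold) and P(evidence | hot).
    p_low_access_given_cold, p_low_access_given_hot = 0.9, 0.2
    p_redundant_given_cold, p_redundant_given_hot = 0.8, 0.5

    def lik(flag: bool, p_true: float) -> float:
        return p_true if flag else 1.0 - p_true

    p_cold = (prior_cold
              * lik(low_access, p_low_access_given_cold)
              * lik(redundant, p_redundant_given_cold))
    p_hot = ((1.0 - prior_cold)
             * lik(low_access, p_low_access_given_hot)
             * lik(redundant, p_redundant_given_hot))
    return p_cold / (p_cold + p_hot)

# Rank blocks from highest to lowest cold probability and pick deletion candidates.
blocks = [BlockStats("blk_001", 1, 3), BlockStats("blk_002", 40, 1)]
for b in sorted(blocks, key=cold_probability, reverse=True):
    print(b.block_id, round(cold_probability(b), 3))

Ranking by the resulting probability and deleting from the top matches the abstract's description of removing "cold" data in order of probability from high to low.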
Keywords/Search Tags: Data migration, Big data, Distributed file storage, Extended Bayesian theorem