
Study On The Distribution And Migration Mechanism Of Data In Cloud Storage

Posted on: 2017-04-07    Degree: Master    Type: Thesis
Country: China    Candidate: Q Wu    Full Text: PDF
GTID: 2308330503979539    Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the cloud era, big data has attracted increasing attention from both industry and academia. Data now grows at an exponential rate, and in order to guarantee the quality of service a data center offers to external users and to distribute file storage more effectively, most enterprises adopt some form of cloud computing (or cloud model) to access and store resources. Dynamic data migration moves heavily accessed data to other servers, so the storage location of data becomes highly flexible. Cloud providers therefore use dynamic data migration to pursue several goals at once: maximizing revenue, reducing operational costs, achieving green IT, and meeting the service demands of users in different geographic locations. Consequently, much recent research focuses on how to design or improve dynamic data migration algorithms.

Against the background of big-data analytics and the spread of unstructured data, Hadoop has received unprecedented attention. Its distributed file system, HDFS (Hadoop Distributed File System), is a distributed file storage system that supports operations on files such as create, delete, move, and rename. The HDFS architecture is built on a specific set of node roles: a NameNode (only one) and DataNodes that provide the storage blocks. Although each DataNode periodically reports all of its block information to the NameNode, the NameNode does not know which data is stored under other NameNodes.

This paper therefore further studies the relationship between the HDFS architecture and its nodes. First, it proposes that NameNodes create a mapping table and write the specific details of each data migration into it, so that data remains consistent between nodes during dynamic migration; this realizes an efficient, real-time distributed data migration mechanism (a sketch of such a mapping table is given below). Second, "cold" data is deleted in a principled way using a Bayesian algorithm. Traditional applications of Bayes' theorem condition on a single observed event and then compute a probability; in this algorithm, the evidence instead comes from analyzing the data itself. The amount of access to the data and the number of copies of the data held by surrounding hosts serve as the two conditions for deletion, so the Bayes formula is applied in an extended, more scientific and accurate form. The resulting probabilities are ranked from high to low, and the "cold" data with the highest probability is deleted (an illustrative scoring sketch also follows below).
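The abstract does not give the exact schema of the proposed mapping table, so the following Python sketch only illustrates the idea under assumed names and fields: each entry records which block is moving, from where to where, and whether the move has completed, so that reads can be resolved consistently while a migration is in flight.

import time
from dataclasses import dataclass, field

@dataclass
class MigrationRecord:
    # Hypothetical fields; the thesis does not specify the table layout.
    block_id: str
    source_node: str
    target_node: str
    started_at: float = field(default_factory=time.time)
    completed: bool = False

class MappingTable:
    """In-memory stand-in for the NameNode-side mapping table (assumed structure)."""

    def __init__(self) -> None:
        self._records: dict[str, MigrationRecord] = {}

    def begin_migration(self, block_id: str, source: str, target: str) -> None:
        # Record the migration before any data is moved.
        self._records[block_id] = MigrationRecord(block_id, source, target)

    def complete_migration(self, block_id: str) -> None:
        # Mark the move as finished once the target holds a consistent copy.
        self._records[block_id].completed = True

    def locate(self, block_id: str) -> str:
        """Return the node that currently serves the block: the target after the
        move finishes, otherwise the source, so readers see consistent data."""
        record = self._records[block_id]
        return record.target_node if record.completed else record.source_node

This is only a sketch of the bookkeeping; in the system described by the abstract, the table would additionally be shared between NameNodes so that migration information is visible across nodes in real time.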
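The thesis does not state its exact priors and likelihoods, so the sketch below is only a hypothetical naive-Bayes-style scoring that mirrors the two deletion conditions named in the abstract: the amount of access to the data and the number of copies held by surrounding hosts. All numeric values and names are placeholders.

from dataclasses import dataclass

@dataclass
class BlockStats:
    block_id: str
    access_count: int     # reads observed in the measurement window
    nearby_replicas: int  # copies of the same block on surrounding hosts

def cold_probability(stats: BlockStats,
                     prior_cold: float = 0.3,
                     access_threshold: int = 5,
                     replica_threshold: int = 2) -> float:
    """P(cold | low access, enough nearby replicas) under assumed likelihoods."""
    low_access = stats.access_count < access_threshold
    redundant = stats.nearby_replicas >= replica_threshold

    # Hypothetical likelihoods P(evidence | cold) and P(evidence | hot).
    p_low_access_given_cold, p_low_access_given_hot = 0.9, 0.2
    p_redundant_given_cold, p_redundant_given_hot = 0.8, 0.5

    def lik(flag: bool, p_true: float) -> float:
        return p_true if flag else 1.0 - p_true

    p_cold = (prior_cold
              * lik(low_access, p_low_access_given_cold)
              * lik(redundant, p_redundant_given_cold))
    p_hot = ((1.0 - prior_cold)
             * lik(low_access, p_low_access_given_hot)
             * lik(redundant, p_redundant_given_hot))
    return p_cold / (p_cold + p_hot)

# Rank blocks from highest to lowest cold probability and pick deletion candidates.
blocks = [BlockStats("blk_001", 1, 3), BlockStats("blk_002", 40, 1)]
for b in sorted(blocks, key=cold_probability, reverse=True):
    print(b.block_id, round(cold_probability(b), 3))

Ranking by the resulting probability and deleting from the top matches the abstract's description of removing "cold" data in order of probability from high to low.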
Keywords/Search Tags: Data migration, Big data, Distributed file storage, Extended Bayesian theorem