Research Of Storage Technology Based On HDFS

Posted on: 2014-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Wang
GTID: 2248330395983988
Subject: Computer software and theory
Abstract/Summary:
Cloud storage builds on cloud computing and extends distributed computing, parallel computing, and grid computing. Like cloud computing, it consolidates storage devices through cluster applications, grid technology, and distributed file systems. To the end user, cloud storage is not a concrete device but a service that provides data access; its core is the combination of applications and storage devices. Among cloud computing platforms, the open-source Hadoop project has attracted wide attention. HDFS (Hadoop Distributed File System) is one of the core components of Hadoop and a common basis for studying cloud storage technology. Its high reliability, strong scalability, and low cost have made it one of the hot topics in cloud storage research.

This paper studies cloud storage technology based on HDFS and discusses mechanisms for improving HDFS in light of the heterogeneity found in cloud storage. First, HDFS treats node failure as the norm and provides a mechanism for detecting faulty nodes, but it does not make full use of historical data. This paper builds a trust model in HDFS that helps the master node select datanodes whose credit value meets the requirements. Second, the types of data in cloud storage are varied, and so are the applications that use them. Different users may have different performance requirements, but HDFS does not take this diversity of data into account. This paper puts forward a data classification model and establishes corresponding performance selection criteria for each type of data. Third, the current HDFS assumes that the nodes in a cluster are homogeneous and that all nodes have the same performance. This assumption is difficult to satisfy in a practical environment, and it ignores dynamic changes in node performance. This paper therefore puts forward a performance evaluation mechanism that gives a real-time performance evaluation value for every node in the cluster, so that the master node can select the optimal node for data storage. Finally, experiments on the Hadoop platform demonstrate the effectiveness of the improved data storage strategy.
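The abstract does not give the thesis's formulas, so the following is only a minimal sketch of how the three ideas (a trust threshold, per-data-type selection criteria, and a per-node performance value) might fit together when a master node picks datanodes. Every name, field, weight, and threshold here is a hypothetical assumption made for illustration; none of it is taken from the thesis or from the HDFS API.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    name: str
    trust: float      # hypothetical credit value in [0, 1], e.g. derived from failure history
    cpu_free: float   # fraction of idle CPU, in [0, 1]
    disk_free: float  # fraction of free disk space, in [0, 1]
    net_free: float   # fraction of unused network bandwidth, in [0, 1]


# Hypothetical data classes: each has weights for (cpu, disk, net)
# and a minimum trust value a node must reach to be considered.
PROFILES = {
    "archive": {"weights": (0.1, 0.7, 0.2), "min_trust": 0.5},
    "hot": {"weights": (0.4, 0.2, 0.4), "min_trust": 0.8},
}


def performance(node, weights):
    """Weighted sum of the node's currently free resources (assumed scoring rule)."""
    w_cpu, w_disk, w_net = weights
    return w_cpu * node.cpu_free + w_disk * node.disk_free + w_net * node.net_free


def select_nodes(nodes, data_type, replicas=3):
    """Drop nodes below the trust threshold, then return the best-scoring ones."""
    profile = PROFILES[data_type]
    trusted = [n for n in nodes if n.trust >= profile["min_trust"]]
    ranked = sorted(
        trusted,
        key=lambda n: performance(n, profile["weights"]),
        reverse=True,
    )
    return ranked[:replicas]
```

In this sketch the trust value acts as a hard filter and the performance value only ranks the nodes that pass it. Whether the thesis combines the two in this way, or folds them into a single score, is not stated in the abstract.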
Keywords/Search Tags: Cloud Storage, HDFS, Trust Model, Heterogeneity