Design And Implementation Of The Data Storage System Based On Cloud Storage

Posted on: 2013-09-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Guo
GTID: 2248330374499300
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, data storage plays an increasingly important role in IT. The amount of data is growing exponentially, and because of capacity, price, and security constraints, local storage has gradually become inadequate. As a result, cloud storage, which is built on distributed file systems, is now widely used. The Hadoop Distributed File System (HDFS) has attracted particular attention for its strong fault tolerance and scalability. Because HDFS is an open-source implementation of the Google File System (GFS), which was designed to support Google's search engine, it performs well for search-engine workloads that consist of many large files. To meet general storage demands, however, further research and improvement are needed.

In search-engine workloads, files are almost always very large, whereas general-purpose workloads contain files of all sizes. In HDFS, when the number of blocks becomes very large, block access performance degrades because the single Namenode becomes a performance bottleneck. Despite its many advantages, HDFS therefore performs well only for a narrow range of applications, as a consequence of its original design purpose.

The purpose of this thesis is to design and implement a distributed file system that can host a wide range of applications. It first proposes a new distributed file system architecture with multiple Namenodes to address the HDFS shortcoming described above. File metadata is stored across several distributed Namenodes. Each Namenode stores only the mapping between files and blocks, while block location information is stored in the DatanodeManager, which also reduces the load on the Namenodes. The thesis then turns to the Datanode cluster and describes its implementation and key algorithms. We propose an improved file-blocking strategy in which block sizes can vary instead of being fixed at a single size. The system chooses the most suitable blocking method according to the type of application and the attributes of the files to be stored. These measures help to deploy many kinds of applications on the cloud platform and ensure that they run with high performance.
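The metadata split and the variable block size described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not say how files are assigned to Namenodes or how a block size is chosen, so the hash-based `pick_namenode` rule, the size thresholds in `choose_block_size`, the round-robin replica placement, and all class and function names here are assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Namenode:
    # Maps a file path to the ordered list of block IDs that make up the file.
    file_to_blocks: dict = field(default_factory=dict)


@dataclass
class DatanodeManager:
    # Maps a block ID to the Datanodes that hold a replica of that block.
    block_locations: dict = field(default_factory=dict)


def pick_namenode(path, namenodes):
    """Hypothetical placement rule: hash the path to select a Namenode."""
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return namenodes[int(digest, 16) % len(namenodes)]


def choose_block_size(file_size):
    """Hypothetical policy: small blocks for small files, large for big ones."""
    mib = 1024 * 1024
    if file_size <= 4 * mib:
        return 1 * mib
    if file_size <= 256 * mib:
        return 16 * mib
    return 64 * mib


def store_file(path, file_size, namenodes, manager, datanodes, replicas=2):
    """Record metadata for a file; no real data is transferred in this sketch."""
    block_size = choose_block_size(file_size)
    n_blocks = max(1, -(-file_size // block_size))  # ceiling division
    namenode = pick_namenode(path, namenodes)
    block_ids = []
    for i in range(n_blocks):
        block_id = f"{path}#{i}"
        # Round-robin replica placement, purely for illustration.
        manager.block_locations[block_id] = [
            datanodes[(i + r) % len(datanodes)] for r in range(replicas)
        ]
        block_ids.append(block_id)
    # The Namenode keeps only the file-to-block mapping, not the locations.
    namenode.file_to_blocks[path] = block_ids
    return block_ids


def locate_file(path, namenodes, manager):
    """Resolve a file to (block ID, Datanode list) pairs in two lookups."""
    namenode = pick_namenode(path, namenodes)
    return [(b, manager.block_locations[b]) for b in namenode.file_to_blocks[path]]
```

In this sketch a read touches one Namenode for the file-to-block mapping and then the DatanodeManager for block locations, so no single Namenode has to hold location data for every block in the system.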
Keywords/Search Tags: HDFS, Distributed File System, blocking strategy, load balancing, high performance