Design And Implementation Of The Data Storage System Based On Cloud Storage

Posted on:2013-09-28

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Guo

Full Text:PDF

GTID:2248330374499300

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of information technology, data storage plays an increasingly important role in IT. The amount of data is growing exponentially, and local storage, constrained by capacity, price, and security, is gradually becoming inadequate. As a result, cloud storage built on distributed file systems is now widely used. The Hadoop Distributed File System (HDFS) has attracted particular attention because of its strong fault tolerance and scalability. HDFS is an open-source implementation of the Google File System (GFS), which was designed to support Google's search engine, so it performs well for search-engine workloads that involve many large files. To meet general storage demands, however, it needs further research and improvement.

In search-engine workloads, files are almost always very large, whereas general workloads contain files of all sizes. In HDFS, when there are too many blocks, access to them is slow because the single Namenode becomes a performance bottleneck. Despite its many advantages, HDFS therefore performs well for only a narrow range of workloads, a consequence of its original design purpose.

The purpose of this thesis is to design and implement a distributed file system on which a wide range of applications can be deployed. It first proposes a new distributed file system architecture with multiple Namenodes to address the shortcoming of HDFS described above. File metadata is distributed across several Namenodes. The Namenodes store only the mapping between files and blocks, while the block location information is stored in the DatanodeManager, which also reduces the load on each Namenode. The thesis then turns to the Datanode cluster and describes its implementation and key algorithms. We propose a new file blocking strategy in which the block size can take multiple values instead of a single fixed size. The system chooses the most suitable blocking method according to the type of application and the attributes of the files to be stored. These measures make it possible to deploy many kinds of applications on the cloud platform and help ensure that they run with high performance.
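
The division of responsibilities described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The class names `Namenode`, `DatanodeManager`, and `Cluster`, the hash-based assignment of paths to Namenodes, the round-robin replica placement, and the block-size tiers in `BLOCK_SIZE_TIERS` are all assumptions made for illustration; the abstract states only that Namenodes hold the file-to-block mapping, that the DatanodeManager holds block locations, and that the block size may vary.

```python
import hashlib
from dataclasses import dataclass, field

MB = 1024 * 1024

# Hypothetical block-size tiers: (largest file size, block size to use).
# The abstract does not give the actual sizes or selection rules.
BLOCK_SIZE_TIERS = [
    (16 * MB, 1 * MB),
    (1024 * MB, 16 * MB),
]
DEFAULT_BLOCK_SIZE = 64 * MB  # assumed size for files above every tier


def choose_block_size(file_size: int) -> int:
    """Pick a block size from the file size (one possible file attribute)."""
    for limit, block_size in BLOCK_SIZE_TIERS:
        if file_size <= limit:
            return block_size
    return DEFAULT_BLOCK_SIZE


@dataclass
class Namenode:
    """Holds only the mapping from file path to block IDs."""
    file_blocks: dict = field(default_factory=dict)


@dataclass
class DatanodeManager:
    """Holds the mapping from block ID to the Datanodes storing it."""
    block_locations: dict = field(default_factory=dict)


class Cluster:
    def __init__(self, num_namenodes, datanodes, replicas=3):
        self.namenodes = [Namenode() for _ in range(num_namenodes)]
        self.manager = DatanodeManager()
        self.datanodes = list(datanodes)
        self.replicas = min(replicas, len(self.datanodes))
        self._next_block_id = 0

    def namenode_for(self, path):
        # Assumption: file paths are hash-partitioned across the Namenodes.
        digest = hashlib.md5(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.namenodes)
        return self.namenodes[index]

    def create_file(self, path, file_size):
        block_size = choose_block_size(file_size)
        num_blocks = max(1, -(-file_size // block_size))  # ceiling division
        block_ids = []
        for _ in range(num_blocks):
            block_id = self._next_block_id
            self._next_block_id += 1
            # Assumption: replicas are placed round-robin over the Datanodes.
            start = block_id % len(self.datanodes)
            self.manager.block_locations[block_id] = [
                self.datanodes[(start + i) % len(self.datanodes)]
                for i in range(self.replicas)
            ]
            block_ids.append(block_id)
        self.namenode_for(path).file_blocks[path] = block_ids
        return block_ids

    def locate(self, path):
        """Resolve a path to (block ID, Datanodes) pairs in two lookups."""
        block_ids = self.namenode_for(path).file_blocks[path]
        return [(b, self.manager.block_locations[b]) for b in block_ids]
```

In this sketch a read first asks one Namenode for the block IDs of a path and then asks the DatanodeManager where those blocks live, so no single Namenode has to track every block location. A real system would also need replication of the metadata itself and failure handling, which are omitted here.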

Keywords/Search Tags:

HDFS, Distributed File System, blocking strategy, load balancing, high performance

Related items

1	Research On Small File Aggregation Strategy And Performance Optimization Based On HDFS
2	Research On File Accessing Performance Optimization Based On HDFS
3	High-performance File Storage And Management System Based On HDFS
4	Research On Storage Strategy Of Distributed File System HDFS
5	Design And Implementation Of A HDFS-Based File Management System
6	Research And Implementation Of Mass Small File Based On HDFS
7	Research On Performance Optimization Technology Of Namenode Based On HDFS
8	Research On High Availability Management Method Of Metadata For HDFS File System
9	The Research For Replica Strategy Using Distributed Parallel File System HDFS
10	Research And Improvement Of Data Check Strategy In Distributed File System