A Large-Scale Data Object Storage System Based On Master-Slave Architecture

Posted on:2014-08-12

Degree:Master

Type:Thesis

Country:China

Candidate:G L Yi

Full Text:PDF

GTID:2268330422960545

Subject:Software engineering

Abstract/Summary:

With the arrival of huge amounts of data, "Big Data" has gradually come into public view. Two characteristics of Big Data, volume and variety, have made object storage a new design approach for storage systems. An object-based storage system provides a unified storage space for the user. In such a system, each object has a unique access ID, which is set when the object is created. The object store provides simple interfaces for the user: a put interface to upload an object and a get interface to download one. It is widely used because it is simple and easy to use.

Currently, some object storage systems based on a master-slave architecture, such as HDFS and GFS, target large objects. Their metadata grows linearly with the number of objects, and their simple block data structure cannot support the storage of small objects. Other storage systems designed for small objects, such as HayStack and TFS, cannot support large objects and do not support concurrent updates.

In this thesis, I design and implement an object storage system called LaUDObject. By being aware of the end user, LaUDObject can manage large objects effectively and support small objects at the same time. The major contributions of this work are the following:

(1) To keep the mapping table between objects and their locations from growing without bound, LaUDObject puts objects into groups and keeps a mapping table between each group and its location. In this way, the system effectively reduces memory usage on the primary node. Every object has an object ID: the top 32 bits of the ID identify the group number, and the remaining bits identify the sequence number of the object within the group.

(2) It implements a sequential consistency strategy that supports concurrent update operations on small objects and improves the efficiency of updates from clients.

(3) By merging small objects into a large file and building an external index, the system can complete a read operation in just one disk access, improving the efficiency of access to small objects.

(4) By perceiving the identity of the end user, the system can store the data from the same user in one group, which can improve the efficiency of the overall system.

I design comparison test scenarios covering both large and small objects for LaUDObject, Hadoop, and Cassandra, and verify the effectiveness of my work.
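
The ID layout in (1) and the merged-file layout in (3) can be illustrated with a minimal sketch. It is not LaUDObject's actual code: the abstract only states that the top 32 bits of the ID hold the group number, so the 64-bit total width, the names `make_object_id`, `split_object_id`, `GroupTable` and `PackedFile`, and the in-memory index are all assumptions made for illustration.

```python
import io

# Assumption: object IDs are 64 bits wide, so the low 32 bits hold the sequence number.
SEQ_BITS = 32
SEQ_MASK = (1 << SEQ_BITS) - 1


def make_object_id(group, seq):
    """Pack a group number (top 32 bits) and a sequence number (low bits) into one ID."""
    if not 0 <= group <= SEQ_MASK or not 0 <= seq <= SEQ_MASK:
        raise ValueError("group and seq must each fit in 32 bits")
    return (group << SEQ_BITS) | seq


def split_object_id(object_id):
    """Return (group, seq) for an object ID."""
    return object_id >> SEQ_BITS, object_id & SEQ_MASK


class GroupTable:
    """Hypothetical primary-node table: one entry per group instead of one per object."""

    def __init__(self):
        self._locations = {}  # group number -> list of storage node names

    def assign(self, group, nodes):
        self._locations[group] = list(nodes)

    def locate(self, object_id):
        group, _ = split_object_id(object_id)
        return self._locations[group]


class PackedFile:
    """Hypothetical store that appends small objects to one large file.

    The index lives outside the data file (here, a dict in memory) and maps
    each object ID to its (offset, length), so a read is one seek plus one read.
    """

    def __init__(self, stream):
        self._stream = stream  # any seekable binary stream
        self._index = {}       # object ID -> (offset, length)

    def put(self, object_id, data):
        offset = self._stream.seek(0, io.SEEK_END)  # always append at the end
        self._stream.write(data)
        self._index[object_id] = (offset, len(data))

    def get(self, object_id):
        offset, length = self._index[object_id]
        self._stream.seek(offset)
        return self._stream.read(length)
```

With this layout, the primary node's table grows with the number of groups rather than the number of objects, and locating an object needs only a bit shift and a single table lookup.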

Keywords/Search Tags:

distributed object storage, object group, multi-replica consistency, physical storage structure

Related items

1	Research On Replica Consistency Based On Ceph Distribute Storage System
2	IStore:a Multi Strategy De-centralized Object Storage System
3	Researchs Of Replication Management On Object Storage System
4	Design And Implementation Of A Distribute Object Storage System
5	Research On The Key Techniques Of The Object-based Storage Controller
6	Research On An HDFS-based Object Storage System In Cloud Computing Environments
7	Research On Distribute Storage Of Replicas Based On Hadoop
8	Research On Verifying Method Of Cloud Object Storage Integrity And Consistency Based On Authenticated Data Structures
9	The Formal Modeling And Optimization For Data Partitioning And Replica Consistency In Distributed Storage Systems
10	Research On Object Storage Optimization Based On QoS