A Large-Scale Data Object Storage System Based On Master-Slave Architecture

Posted on: 2014-08-12  Degree: Master  Type: Thesis
Country: China  Candidate: G L Yi  Full Text: PDF
GTID: 2268330422960545  Subject: Software engineering
Abstract/Summary:
With the arrival of huge amounts of data, “Big Data” has gradually come into people’s view. Two characteristics of Big Data, volume and variety, make object storage a new design approach for storage systems. An object-based storage system provides a unified storage space for the user. In the system, each object has a unique access ID, which is assigned when the object is created. The object store provides simple interfaces to the user: a put interface to upload an object and a get interface to download an object. It is widely used because it is simple and easy to use.

Currently, some object storage systems based on a master-slave architecture, such as HDFS and GFS, target large objects. Their metadata grows linearly with the number of objects, and their simple block data structure cannot support the storage of small objects. Other storage systems designed for small objects, such as HayStack and TFS, cannot support large objects and do not support concurrent updates.

In this thesis, I design and implement an object storage system called LaUDObject. Building on the ability to perceive the user, LaUDObject manages large objects effectively and supports small objects at the same time. The major work of this thesis includes the following:

(1) To keep the mapping table between objects and their locations from growing, LaUDObject puts objects into groups and keeps a mapping table between each group and its location. In this way, the system effectively reduces memory usage on the primary node. Every object has an object ID: the top 32 bits of the ID identify the group number, and the remaining bits identify the sequence number of the object within the group.
(2) It implements a sequential consistency strategy that supports concurrent update operations on small objects and improves the efficiency of updates from clients.
(3) By merging small objects into a large file and building an external index, the system can complete a read operation with just one disk access, improving the efficiency of access to small objects.
(4) By perceiving the identity of the end user, the system can store data from the same user in one group, which improves the efficiency of the overall system.

I design comparison test scenarios covering both large and small objects for LaUDObject, Hadoop, and Cassandra, and verify the effectiveness of my work.
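
The object ID layout and group-level mapping table from item (1) can be illustrated with a minimal Python sketch. The abstract states only that the top 32 bits hold the group number. The 64-bit total width, the function names, and the node names below are assumptions made for illustration and are not taken from LaUDObject itself.

```python
# Assumption: IDs are 64 bits wide, so the low 32 bits carry the sequence number.
GROUP_BITS = 32
SEQ_BITS = 32
SEQ_MASK = (1 << SEQ_BITS) - 1


def make_object_id(group: int, seq: int) -> int:
    """Pack a group number and an in-group sequence number into one ID."""
    if not 0 <= group < (1 << GROUP_BITS):
        raise ValueError("group number does not fit in 32 bits")
    if not 0 <= seq <= SEQ_MASK:
        raise ValueError("sequence number does not fit in 32 bits")
    return (group << SEQ_BITS) | seq


def split_object_id(object_id: int) -> tuple[int, int]:
    """Recover (group, sequence) from a packed object ID."""
    return object_id >> SEQ_BITS, object_id & SEQ_MASK


def locate(object_id: int, group_locations: dict[int, list[str]]) -> list[str]:
    """Look up storage nodes by group, not by individual object."""
    group, _ = split_object_id(object_id)
    return group_locations[group]


# Hypothetical table kept on the primary node: one entry per group.
group_locations = {7: ["node-a", "node-b", "node-c"]}

oid = make_object_id(group=7, seq=42)
print(split_object_id(oid))          # group and sequence recovered from the ID
print(locate(oid, group_locations))  # replicas that hold group 7
```

Because the table is keyed by group, its size depends on the number of groups and not on the number of objects, which is the memory saving the abstract describes for the primary node.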
Keywords/Search Tags: distributed object storage, object group, multi-replica consistency, physical storage structure