Research On Scalable Cluster Storage System Based On Object Storage Architecture

Posted on: 2006-12-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Liu
Full Text: PDF
GTID: 1118360185463426
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid advancement of processor and networking technology, the computing capability of Linux clusters has grown rapidly. Linux cluster computing has become a popular approach to high-performance computing, with widespread adoption in scientific computing, commercial applications, and high-capacity information services. However, traditional shared storage architecture limits the performance potential of these Linux compute clusters: it is very difficult to build on it a scalable, high-performance, cross-platform, secure data-sharing architecture that meets their storage demands.

The object-based storage architecture is emerging as the foundation for building massively parallel storage systems that leverage commodity processing, networking and storage components to deliver unprecedented scalability and aggregate throughput. After investigating the advantages of the object storage architecture and of existing object storage systems, the dissertation focuses on large-scale cluster storage systems based on that architecture and proposes several novel and practical algorithms. The main contributions of this dissertation are as follows:

(1) A scalable cluster storage system architecture is proposed, based on a deterministic pseudo-random algorithm that guarantees a probabilistically balanced distribution of directory and data objects throughout the system. This simplifies the management of storage systems and supports dynamically balanced scaling of metadata servers and storage nodes.

(2) A metadata management method that separates the directory path attribute from the directory object is proposed for the first time, extending the existing object storage architecture. This method efficiently avoids the large-scale metadata migration that would otherwise follow updates to directory attributes; improves cache utilization and hit rate by reducing the overlapping caching of prefix directories; reduces disk I/O demands by reducing the overhead of traversing the directory path and exploiting directory locality; and avoids overloading a single metadata server through dynamic load balancing. Experimental results demonstrate that this method has clear advantages in improving throughput and scalability, balancing metadata distribution, and reducing metadata migration.

(3) Introducing the Monte Carlo method into the study of data object distribution for the first time, we propose a data object placement algorithm based on dynamic interval mapping, which supports weighted allocation of storage nodes and variable levels of object replication and is probabilistically optimal in both distributing data evenly and minimizing data movement. It efficiently resolves the problem of balancing data distribution in dynamic storage systems and...
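
The abstract does not give the details of the dynamic interval mapping algorithm, so the following is only an illustrative sketch of the properties it names (deterministic pseudo-random placement, weighted storage nodes, a variable number of replicas, little data movement when nodes are added). It uses a different, well-known technique, weighted rendezvous (highest-random-weight) hashing, and should not be read as the dissertation's method. The function names, node names and weights are all hypothetical.

```python
import hashlib
import math


def unit_hash(object_id: str, node: str) -> float:
    """Deterministically map an (object, node) pair to a float strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{object_id}|{node}".encode("utf-8")).digest()
    # Keep 52 bits so that value + 0.5 is exactly representable as a float.
    value = int.from_bytes(digest[:8], "big") >> 12
    return (value + 0.5) / 2**52


def place(object_id: str, weights: dict, replicas: int = 1) -> list:
    """Return `replicas` distinct nodes for an object (weighted rendezvous hashing).

    `weights` maps a node name to a positive capacity weight (assumed input format).
    Every client computes the same answer without consulting a central table.
    """
    if not 1 <= replicas <= len(weights):
        raise ValueError("replicas must be between 1 and the number of nodes")

    def score(node: str) -> float:
        # log(h) is negative for h in (0, 1), so the score is positive and
        # grows with the node's weight.
        return -weights[node] / math.log(unit_hash(object_id, node))

    # The highest-scoring nodes hold the replicas.
    return sorted(weights, key=score, reverse=True)[:replicas]


if __name__ == "__main__":
    # Hypothetical cluster: osd-c has twice the capacity of the others.
    cluster = {"osd-a": 1.0, "osd-b": 1.0, "osd-c": 2.0}
    print(place("object-42", cluster, replicas=2))
```

In this scheme a node's expected share of objects is proportional to its weight, and adding a node changes the primary location only of objects that the new node wins, which is one way to obtain the "even distribution with minimal movement" behaviour the abstract describes.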
Keywords/Search Tags:Object Storage, Metadata Management, Data Object Placement, Scalability, Balancing Distribution, High Availability