
Research On The Reliability Assurance Technology Of Distributed Storage System For Large-Scale Data

Posted on: 2015-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Xiao
Full Text: PDF
GTID: 2298330431983947
Subject: Computer software and theory
Abstract/Summary:
Large-scale data storage faces the problems of large capacity, complex data structures, heterogeneous infrastructure, and failures becoming the norm. An efficient, decentralized metadata management scheme plays a vital role in large-scale distributed storage systems, and a dynamic, self-adaptive replication mechanism can greatly improve I/O response performance, fault tolerance and storage utilization. Existing work has shortcomings in scalability, data migration, data node load balancing, and so on.

Firstly, hash-based and tree-based partition schemes pay a high cost for expansion and are sensitive to changes in the cluster. In response to these problems, CH-MMS, a consistent-hash-based metadata management scheme, is proposed. CH-MMS introduces virtual metadata servers (virtual MDS), which improve load balance. By combining a standby mechanism with a lazy-update policy, CH-MMS achieves fast failover and zero migration when the cluster changes. Owing to its distributed metadata structure, CH-MMS offers fast metadata lookup. To address the damage that a hash structure may cause to the hierarchical semantics of the file system, a simple and flexible mechanism based on regular expression matching is introduced. The following work is presented: 1) the architecture of CH-MMS is expounded; 2) the core data structure LayoutTable, the virtual MDS and the lazy-update policy are introduced, together with their relevant algorithms; 3) a qualitative analysis of scalability and fault tolerance is given.

Secondly, given the inevitability of node failure in distributed storage systems and the serious defects of static replication mechanisms, an evaluation model of file popularity based on the file support degree (SPD) is proposed. SPD adapts to variations in system load by adjusting its parameters periodically, so as to make correct decisions. Based on the workload of the whole system, a data node grouping algorithm is proposed. Within the file support degree model, a data node balancing algorithm, a dynamic file support degree adjusting algorithm and a replication clearing algorithm are proposed.

Finally, the prototype system and simulation show that CH-MMS balances metadata and provides fast failover, flexible expansion and zero migration when the cluster changes. CH-MMS can meet the need for flexible, efficient metadata management in large-scale storage systems with growing data. Experiments also show that the SPD model is effective for data node load balancing and is self-adaptive.
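
The abstract describes placing metadata with a consistent hash over virtual metadata servers. As a rough illustration of that general idea only, the sketch below builds a textbook consistent-hash ring with virtual nodes. It is not the thesis's CH-MMS: the class `HashRing`, the server names `mds-1` to `mds-4`, the choice of MD5, and the figure of 100 virtual nodes per server are all assumptions made for the example, and the LayoutTable, standby mechanism and lazy-update policy are not modelled.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on a 2**32 ring (MD5 is an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Textbook consistent-hash ring with virtual nodes (hypothetical names)."""

    def __init__(self, vnodes_per_server: int = 100):
        self.vnodes_per_server = vnodes_per_server
        self._points = []  # sorted ring positions of all virtual nodes
        self._owner = {}   # ring position -> physical server name

    def add_server(self, server: str) -> None:
        """Place vnodes_per_server virtual nodes for this server on the ring."""
        for i in range(self.vnodes_per_server):
            point = _hash(f"{server}#vn{i}")
            if point in self._owner:  # rare collision: keep the existing owner
                continue
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove_server(self, server: str) -> None:
        """Drop every virtual node that belongs to this server."""
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, path: str) -> str:
        """Return the server owning the first virtual node clockwise of the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(path)) % len(self._points)
        return self._owner[self._points[idx]]


# Usage: record placements, grow the cluster, and see which keys changed owner.
ring = HashRing()
for name in ("mds-1", "mds-2", "mds-3"):
    ring.add_server(name)

paths = [f"/data/file{i}" for i in range(1000)]
before = {p: ring.lookup(p) for p in paths}

ring.add_server("mds-4")
after = {p: ring.lookup(p) for p in paths}
moved = [p for p in paths if before[p] != after[p]]
print(f"{len(moved)} of {len(paths)} keys changed owner")
```

In this plain ring, the only keys that change owner when a server joins are those taken over by the new server, which is why consistent hashing is less sensitive to cluster changes than a simple modulo hash. The zero migration reported for CH-MMS relies on its lazy-update policy and goes beyond what this sketch shows.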
Keywords/Search Tags: Large-scale Data, Distributed File System, Metadata Management, Consistent Hash, Dynamic Replication Mechanism, Load Balance