
Research On Technology Of Failure Detection And Replica Placement For High Availability Cluster System

Posted on: 2013-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: J F Li
GTID: 2248330395980559
Subject: Military communications science
Abstract/Summary:
At present, cluster systems are widely applied across all industries. With numerous applications running on cluster platforms, high availability has become highly desirable. A high-availability cluster can provide highly reliable, integrated services for computing tasks in the face of hardware and software faults. A high-availability cluster needs to address three issues. The first is system monitoring, which tracks the condition of the running system in order to determine node availability in a timely and effective manner. The second is system backup, which keeps copies of the data for system recovery and decides on which nodes the replicas are placed. The third is system recovery, in which the cluster obtains an available copy of the data after a failure so that the system remains highly available.

High-availability clusters still face several problems: detection points with high cost and low reliability, and poor scalability of failure detection algorithms and replica placement algorithms.

To address these issues, the paper briefly presents a systematic analysis of the basic principles, models, main protocols and typical algorithms of failure detection and data replication technology. The current research results and shortcomings of the two technologies as applied in cluster systems are analyzed in depth. The paper presents the following innovative research.

Firstly, a reliable detection mechanism based on a ring structure is proposed. Focusing on the single point of failure faced by hierarchical failure detection in cluster systems, the paper analyzes node failure factors in depth and presents a reliable detection mechanism based on the ring structure, including ring detection, random semi-confirmation and election technology.
In addition, a reliable detection point for hierarchical failure detection is designed. Experiments show that the mechanism can accurately locate and replace a failed node in the ring, thereby protecting the detection point so that it provides reliable detection services.

Secondly, a weight-based two-layer failure detection algorithm is proposed. To resolve the failure detection overload and system recovery cost brought by large-scale cluster systems, the paper analyzes node criticality in depth and adjusts the detection frequency of nodes according to their criticality, providing technical support for the scalability of the cluster system.

Thirdly, a clustering-based asynchronous data replica placement algorithm is proposed. Because existing replica placement algorithms incur high data recovery and update costs, the paper proposes a clustering-based asynchronous data replica placement algorithm after an in-depth study of replica placement rules. First, all nodes are divided into a few uniform partitions by the classic k-means method. Then, optimal replica placement is achieved within each local partition. Finally, a global optimization adjustment of the replica placement is performed. Experiments show that the algorithm places replicas reasonably and effectively reduces the cost of data recovery and updating.

Through the above work, we can optimize the reliability of detection points, effectively reduce the detection load, further reduce the cost of recovery and updating, and ultimately provide effective support for cluster systems.
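
The first two steps of the clustering-based replica placement described above (k-means partitioning, then placement within each partition) might look roughly like the sketch below. This is only an illustration and not the thesis's algorithm: the 2-D node coordinates are a hypothetical stand-in for inter-node distance, the "smallest total distance to peers" rule is an assumed reading of optimal local placement, and the function names `kmeans` and `local_placement` are invented here. The global adjustment step is omitted because the abstract does not describe it.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-D points (assumed distance metric)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns k clusters as lists of point indices."""
    # Deterministic initialisation: k points spread evenly through the input order.
    centers = [points[i * len(points) // k] for i in range(k)]
    clusters = []
    for _ in range(iters):
        # Assignment step: each node joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(idx)
        # Update step: move each centre to the mean of its members.
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                x = sum(points[i][0] for i in members) / len(members)
                y = sum(points[i][1] for i in members) / len(members)
                new_centers.append((x, y))
            else:
                new_centers.append(centers[c])  # keep the centre of an empty cluster
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return clusters

def local_placement(points, clusters):
    """Per cluster, pick the node with the smallest total distance to its peers
    (an assumed placement rule, chosen only for illustration)."""
    replicas = []
    for members in clusters:
        if not members:
            continue
        best = min(members,
                   key=lambda i: sum(dist(points[i], points[j]) for j in members))
        replicas.append(best)
    return replicas

# Hypothetical cluster of six nodes forming two well-separated groups.
nodes = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
clusters = kmeans(nodes, k=2)
replicas = local_placement(nodes, clusters)
print(clusters, replicas)
```

In this toy layout each group receives one replica on its most central node. A real deployment would presumably replace the coordinates with measured recovery or update costs between nodes.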
Keywords/Search Tags: High-availability cluster system, Failure Detection, Ring Structure, Weight, Cluster Analysis, Data Replication Placement