Research on QoS-Oriented Failure Detection Service in Distributed Systems

Posted on: 2019-04-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J X Liu
Full Text: PDF
GTID: 1368330590972788
Subject: Computer system architecture
Abstract/Summary:
With the development of distributed applications (especially in the military, finance, aerospace, and medical fields), the high availability of distributed systems has become very important. The failure detector is one of the fundamental components for maintaining the high availability of distributed systems. However, failure detection involves a trade-off among detection accuracy, speed, and overhead: requiring higher accuracy reduces detection speed and increases overhead, while higher detection speed consumes more overhead and lowers accuracy. This trade-off is exacerbated as the scale and complexity of distributed systems increase. Adaptive failure detection and the mechanism of sharing detection results are the primary methods of resolving it. This dissertation studies several critical problems related to these two methods, and its results provide theoretical support for designing highly available distributed systems.

In large-scale distributed systems, many distributed applications have different Quality of Service (QoS) requirements for failure detection. The failure detector therefore needs not only to adapt to the changing network environment but also to meet the different QoS requirements simultaneously. The Accrual failure detector is an important solution to these requirements. Because the high availability and efficiency of the system and the operation of its applications all depend on accurate failure detection, this dissertation studies the detection accuracy of the Accrual failure detector in different distributed systems.

First, the network environment presents new characteristics with the development of large-scale distributed systems (e.g., Cloud Computing, Internet of Things). An analysis of heartbeat behavior under different network environments shows that the Weibull distribution is a more reasonable assumption for the heartbeat inter-arrival time. On this basis, an Accrual detector based on the Weibull distribution, named WD-FD, is proposed; it provides higher accuracy and shorter detection time with the same overhead. Experiments validate and analyze the performance of WD-FD. In addition, WD-FD is proved to implement a failure detector of class ◇P_ac in the partially synchronous model.

Second, if a large number of mobile nodes exist in a large-scale distributed system, the system topology changes frequently and the network environment is prone to sudden changes. The single sliding window of the Accrual detector cannot adapt well to such sudden changes, and detection accuracy is affected by whether the window is larger or smaller. An Accrual detector based on dual sliding windows, named 2WA-FD, is therefore proposed, which clearly improves detection accuracy in highly dynamic network environments. In 2WA-FD, the larger sliding window handles the stable network environment, and the smaller one handles sudden changes in the network environment. The experimental results show that 2WA-FD detects node failures accurately and rapidly. It is also proved that 2WA-FD implements a failure detector of class ◇P under a partially synchronous model.

The adaptive failure detectors above improve accuracy effectively. However, in distributed systems whose nodes have limited energy, the energy consumed by adaptive failure detection is a problem, because node energy depletion is one of the main causes of node failure. As a fundamental component for maintaining the high availability of distributed systems, the failure detector must maintain many detection relationships, and a node's energy is affected by the computation and storage costs of failure detection. On this basis, an energy-efficient failure detector, 2E-FD, based on prediction from the last heartbeat is proposed. It relies neither on maintaining a sliding window nor on a distribution assumption for heartbeats. Using simple calculation, 2E-FD provides a failure detection service with high accuracy and low power consumption. In the experiments, 2E-FD has the smallest impact on node energy consumption while still ensuring accuracy. 2E-FD also implements a failure detector of class ◇P under a partially synchronous model.

Adaptive failure detection is an important way to handle the trade-off among detection accuracy, speed, and overhead, but it is limited when real-time applications demand high detection speed. When the required detection speed is high, adaptive failure detection can shorten the heartbeat sending interval to achieve rapid detection, but too many heartbeats increase the overhead and reduce accuracy. The mechanism of sharing detection results is therefore studied. On this basis, a hierarchical failure detection algorithm, HS-FD, based on the mechanism of sharing results is proposed. By exploiting the features of the system architecture, result sharing improves the performance of failure detection. In addition, a computational model relating detection parameters to QoS metrics is built. While ensuring accuracy, HS-FD achieves faster failure detection with lower overhead.
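The idea of an accrual detector that assumes Weibull-distributed heartbeat inter-arrival times can be illustrated with a minimal sketch. This is not the dissertation's WD-FD algorithm, whose details are not given in the abstract. The class name, the fixed shape parameter, the moment-based scale estimate, and the -log10 scaling of the suspicion level (borrowed from the common phi-accrual convention) are all assumptions made for illustration.

```python
import math
from collections import deque


class WeibullAccrualSketch:
    """Hypothetical accrual detector; NOT the dissertation's WD-FD."""

    def __init__(self, window_size=100, shape=1.5):
        # Sliding window of recent heartbeat inter-arrival times.
        self.intervals = deque(maxlen=window_size)
        # Assumed fixed Weibull shape k; a real detector might estimate it.
        self.shape = shape
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def suspicion(self, now):
        """Return a suspicion level that grows while no heartbeat arrives."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        # Weibull mean is scale * Gamma(1 + 1/k); solve for the scale.
        scale = mean / math.gamma(1.0 + 1.0 / self.shape)
        elapsed = max(0.0, now - self.last_arrival)
        # Survival function: P(interval > t) = exp(-(t / scale) ** k).
        # Suspicion = -log10 of that probability (assumed convention).
        return (elapsed / scale) ** self.shape / math.log(10)


detector = WeibullAccrualSketch(shape=1.5)
for t in range(11):          # heartbeats at t = 0, 1, ..., 10 seconds
    detector.heartbeat(float(t))
print(detector.suspicion(10.5), detector.suspicion(13.0))
```

Because the detector outputs a continuous suspicion level instead of a binary verdict, each application can compare it against its own threshold: a low threshold favors detection speed, a high one favors accuracy. This is how a single accrual detector can serve applications with different QoS requirements at once.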
Keywords/Search Tags:distributed system, failure detection, Quality of Service, adaptive failure detection, detection mechanism of sharing results, Accrual detector