Font Size: a A A

A Research On Reliability Of Distributed System Based On Model Checking

Posted on:2022-06-16Degree:MasterType:Thesis
Country:ChinaCandidate:C Y LuFull Text:PDF
GTID:2518306725485604Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
In the era of the mobile Internet data blowout,the single software architecture can no longer meet the huge data processing needs.The distributed software architecture gradually replaces the single software architecture and becomes the mainstream solution for big data processing.A cloud computing platform based on a distributed architecture can provide users with on-demand computing resources through network services,which has become the primary choice for enterprises for business deployment and data processing.Due to the concurrent nature of the distributed system,the system will have bugs in some unexpected places,and these bugs often cannot be reproduced.This makes the use of traditional software testing methods unable to improve the confidence of the technicians in the reliability of the distributed system.In addition,the distributed system utilizes a complex topology structure to coordinate a large number of commodity machine nodes.Therefore,problems such as node faults and network faults in the distributed system are unavoidable.In order to improve the reliability and stability of distributed systems,fault-tolerant mechanisms are often added in the design of distributed algorithm,so that the system can operate normally even in the event of node faults or network faults.However,how to ensure the correctness of the fault-tolerant mechanism is currently a challenge.Model checking technology can model the distributed system and verify all the possible uncertain system behaviors.On this basis,this paper proposes a distributed fault-tolerance model checking(Fault-Tolerance Model Checking,FTMC)method that combines fault injection and model checking.Aiming at the system state space explosion problem,this paper proposes a reduction method(Peer-to-peer Reduction Policy,PRP)based on peer nodes.Aiming at the issue of the fairness of the execution path of the system to be verified for model checking,this paper proposes a fairness-based path sorting(Fairness Path Sort,FPSort)method.Through the above three algorithms,the distributed system protocol and its fault-tolerance mechanisms can be effectively verified.The specific research work of this paper includes:1.The FTMC method that combines fault injection and model checking technology is proposed,which dynamically iteratively searches and executes all possible execution paths of the system to be verified,and then finds the fault injection point,injects faults at a specific location and time.By observing the behavior and status of the system,find all the execution paths that produce the violation Bug.Through experiments,it can be found that the FTMC method can search and verify the correctness of all states and fault-tolerance mechanisms of the system.2.Using model checking needs to search all states of the system to be verified,but many system states are redundant,resulting in state space explosion and the waste of time.To solve this problem,this paper proposes a PRP method that utilizes the redundant path reduction of peer nodes.When searching the system state,the PRP algorithm obtains the current state of all nodes,and finds out redundant system execution paths based on node peering to reduce these redundant system execution paths.Through experiments,it can be found that the PRP method can effectively reduce the redundant execution paths due to peer nodes and improve the efficiency of the FTMC.3.Due to the randomness in the search of the system execution path by the model checking,certain system behaviors cannot be executed for a long time,resulting in the phenomenon of system behavior starvation in the execution paths,which has a low probability in reality.In response to this problem,this paper proposes a fairnessbased FPSort method for path sorting.FPSort uses a virtual clock mechanism to ensure that the hungry system behavior can be searched as soon as possible,so that the execution paths searched first are the most common in the system,ensuring that bugs in these paths can be exposed as soon as possible.Through experiments,it can be found that the FPSort method can effectively search for the execution paths that meets the fairness first,and further improves the time efficiency of the FTMC.
Keywords/Search Tags:Distributed System, Model Checking, Fault Injection, Peer Reduction, Fairness
PDF Full Text Request
Related items