Research On Deadlock Detection Method In Distributed Systems

Posted on:2019-07-13

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Yang

Full Text:PDF

GTID:1368330545972292

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

Distributed systems face the challenges of reliability,security,and scalability while providing powerful service capabilities.Deadlock caused by conflicts between resource allocation and demand is a common software error in resource-sharing systems.Failure to timely handle deadlocks in the system may result in degraded user experience,incorrect results,and reduced system efficiency even system crashes.Common deadlock handling methods include deadlock prevention,deadlock avoidance,and deadlock detection with resolution.Compared with deadlock prevention and deadlock avoidance,deadlock detection and resolution is simple,efficient and feasible.However,unlike centralized systems,distributed systems have no shared memory,no global clock,and soft and hardware failure probability is high.This increases the complexity and difficulty of deadlock handling in distributed systems.At the meanwhile,as an application running in a system,applications for deadlock detection and resolution need to focus on efficiency while ensuring correctness.Through the research and analysis of existing deadlock detection algorithms in distributed systems,we find that most of the existing deadlock detection algorithms in the distributed system focus on the problem of correctness and efficiency when a single process performs a deadlock detection.However,in a real distributed system,multiple processes that are blocked due to resource contention may initiate deadlock detection in parallel.When multiple processes detect the same deadlock in parallel in the system,problems such as pseudo deadlock caused by reaptly detection and resolving the same deadlock,non-optimal deadlock solution,and algorithm efficiency may occur.Inaddition,most of the existing algorithms do not consider the fault tolerance problem when a computing node or process partially or completely fails.Furthermore,distributed systems with mobility features,such as mobile wireless network systems,mobile agent systems,and mobile edge computing systems,further increase the difficulty of deadlock detection algorithm design.This dissertation analyzes the limitations of the existing deadlock detection algorithms in a distributed system and proposes solutions to resolve the shortage of correctness,efficiency,fault tolerance and scalability of the existing algorithm in the case of parallel deadlock detection.The main research content of this dissertation are listed as follows:(1)An improved priority-based distributed parallel deadlock detection algorithm is proposed.In this algorithm,the deadlock detection process with high-priority will reuse the deadlock-related information collected by low-priority processes to reduce the number of transmitted messages.In the meantime,in order to further improving the performance of the algorithm,the proposed algorithm does not transfer deadlock-related information until each process terminates normally.Finally,through the improved strategy of priority comparison,the proposed algorithm chooses only one deadlock detection process that initiates deadlock detection to perform deadlock-related information collection,deadlock detection and resolution to ensure the correctness.Compared with the existing algorithms,our proposed algorithm has better efficiency and correctness guarantee.(2)A distributed fault-tolerant parallel deadlock detection algorithm based on leader election is proposed.Aiming at the problem of communication failure between processes used to detect deadlocks in a distributed system,an algorithm that can tolerate communication failure within a certain range is proposed.Each deadlock detection process maintains the failure record and a list of central control process candidates,and it recommends a new central control process based on the process priority when communication failure happens between it and current central control process.The proposed algorithm retains the efficiency of the centralized deadlock detection algorithm while providing single-process deadlock detection and provides a certain degree of fault tolerance.At the same time,the algorithm also has higher efficiency in the case of multi-process parallel deadlock detection.(3)An improved distributed parallel deadlock detection algorithm for single resource request model in mobile agent systems is proposed.The proposed algorithm incrementally collects the information of deadlock processes to improve the efficiency.The movement of mobile agent and the amount of data carried by all of the mobile agents used to collect deadlock related information is reduced by adopting the improved priority-based strategy and a lazy replying strategy.Finally,the correctness of deadlock detection and resolution is guaranteed by the improved priority-based strategy.(4)A distributed parallel deadlock detection algorithm under the generalized resource model in mobile agent system is proposed.The proposed algorithm adapts the weight distribution,diffusion computation and lazy replying strategies to improve the efficiency.Similarly,in order to ensure that the algorithm detects and resolves the real deadlock,the algorithm also uses a priority-based approach to avoid repeat detecting or resolving of the same deadlock in the system.It also breaks the limitation of the existing deadlock detection algorithms on the resource model(i.e.,Single resource model)and extends the deadlock detection algorithm without sacrificing correctness and performance.In this dissertation,the correctness of the proposed algorithm(i.e.,liveness and safety)is by proved by informal proofs and verified through formal verification based on TLA+(Temporal Logic of Actions)language.The formal verification results show that the proposed algorithm satisfies the liveness and safety property.At the same time,the performance comparison between existing deadlock detection algorithms and the proposed algorithm is also carried out through theoretical performance analysis and simulation experiments.The comparison results show that the proposed algorithm has higher efficiency.

Keywords/Search Tags:

Deadlock Detection, Parallel Detection, Fault Tolerance, Distributed System, TLA+

PDF Full Text Request

Related items

1	Research On Deadlock Detection Algorithm In Distributed Database
2	Research On Cycle And Knot Oriented Distributed Deadlock Detection Algorithm
3	Research Of Efficient Deadlock Detection Algorithm In Distributed System
4	The Design And Implementation Of Fault Tolerance System In The Distributed Storage System
5	Research On Key Technologies Of Fault Tolerance Of Large Scale Distributed Simulation System
6	The Research And Implementation Of Distributed Storage System Fault-tolerance Mechanism
7	Fault Tolerant Technology Based On Autoecious Fault Detection In Distributed Paralled Server System
8	Study On Fault-Tolerance Mechanism And Realization In Real-Time Distributed Computer Systems
9	DISTRIBUTED DEADLOCK DETECTION IN DISTRIBUTED DATABASE SYSTEMS
10	Study And Realization Of Fault-tolerant Technology For Parallel Computer On Satellite