Research Of System-Level Fault Diagnosis For Hypercube Multicomputer

Posted on:2006-06-15

Degree:Master

Type:Thesis

Country:China

Candidate:T Dong

Full Text:PDF

GTID:2168360155972896

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

In recent years, large-scale multicomputer systems play a key role in computing. However, they are vulnerable to component failures with the increase in number of computing units. In order to maitain high reliability and high availability, system-level fault diagnosis is an important consideration in parallel processing, which is with the purpose of isolating faulty units through the analysis of syndromes obtained by collecting test results. The hypercube structure is a kind of interconnected network model with better structure properties and network parameters. Consequently, many researchers are interested in it and some commercial parallel computers are based on this type of architectures. Due to the properties of hypercubes, such as symmetry, recursive construction, logarithm level diameter compared to the number of nodes in systems etc., many efficient algorithms can be devised by utilizing them. In this thesis, we focus on designing highly efficient fault diagnosis algorithms for hypercubes using the new discovered properties of hypercubes. First, we exhibit the current research situation of system-level diagnosis. Second, the structures and properties of hypercubes are analyzed. We also studied two classical algorithms of system-level fault diagnosis. One is an O(t³+|E|) time complexity algorithm proposed by Sullivan for t-diagnosable systems. The other is provided by Yang, which can reach O(Nlog₂N) time complexity with the help of cycle-decomposition technique. Finally, we present our two new fault diagnosis algorithms for hypercube multicomputer systems. The two algorithms are all based on breadth-first search (BFS). Applying the completely unbalanced spanning tree embedded in hypercubes and its properties to our algorithms, we improve the efficiency of breadth-first search (BFS). By introducing the result shown by Yang that an n-dimensional cube with a set F of at most (4n â€“10) failing processors has a connected component of size â‰¥2n-|F|-3 and the completely unbalanced spanning tree embedded in hypercubes, the first algorithm runs in O(Nlog₂N) time complexity with 4n-7 fault-tolerance capability. The algorithm searches for the largest connected component of a hypercube to isolate faulty nodes, whose time complexity is as good as that of the current best algorithm but with more fault-tolerance capability. The second algorithm is also derived from the breadth-first search method. Inspired by a cycle-decomposition technique, we design a linear time complexity algorithm with 2n-2 fault-tolerance capability through proposing a new tree decomposition technique and utilizing the properties of completely unbalanced spanning tree in hypercubes. In order to enhance the performance of our algorithms, we explore some new properties of hypercubes and apply them to our algorithms. One algorithm is superior to others in fault-tolerance capability and the other reaches the linear time complexity.

Keywords/Search Tags:

Fault diagnosis, Tree decomposition, Hypercube, Fault tolerance, Networks, Algorithm, Parallel computing

PDF Full Text Request

Related items

1	The Research On Fault-Tolerant Model And Routing Algorithms In Hypercube Networks
2	Fault Tolerant Models And Fault Tolerant Routing Algorithms In Hypercube Networks With A Large Number Of Faulty Nodes
3	The Study And Analysis On Fault-Tolerant Parallel Algorithm
4	No Precise System-Level Fault Diagnosis For Interconnection Networks
5	Study And Implementation Of Fault Tolerance For Heterogeneous Parallel Computer
6	Reasch On G-good-neighbor Local Fault Diagnosis And T/k-fault Diagnosis Of Regular Networks
7	Research On Adaption Method Of Cloud Fault Tolerance Services Based On User Requirement And Resource Constriction
8	The Research On Fault Tolerant Multicast In Hypercube Network
9	Research On The Reliability Of The Generalized Hypercube
10	On-line EMC Fault Diagnosis Technology Based On Fault Tree