Font Size: a A A

Research Of System-Level Fault Diagnosis For Hypercube Multicomputer

Posted on:2006-06-15Degree:MasterType:Thesis
Country:ChinaCandidate:T DongFull Text:PDF
GTID:2168360155972896Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
In recent years, large-scale multicomputer systems play a key role in computing. However, they are vulnerable to component failures with the increase in number of computing units. In order to maitain high reliability and high availability, system-level fault diagnosis is an important consideration in parallel processing, which is with the purpose of isolating faulty units through the analysis of syndromes obtained by collecting test results. The hypercube structure is a kind of interconnected network model with better structure properties and network parameters. Consequently, many researchers are interested in it and some commercial parallel computers are based on this type of architectures. Due to the properties of hypercubes, such as symmetry, recursive construction, logarithm level diameter compared to the number of nodes in systems etc., many efficient algorithms can be devised by utilizing them. In this thesis, we focus on designing highly efficient fault diagnosis algorithms for hypercubes using the new discovered properties of hypercubes. First, we exhibit the current research situation of system-level diagnosis. Second, the structures and properties of hypercubes are analyzed. We also studied two classical algorithms of system-level fault diagnosis. One is an O(t3+|E|) time complexity algorithm proposed by Sullivan for t-diagnosable systems. The other is provided by Yang, which can reach O(Nlog2N) time complexity with the help of cycle-decomposition technique. Finally, we present our two new fault diagnosis algorithms for hypercube multicomputer systems. The two algorithms are all based on breadth-first search (BFS). Applying the completely unbalanced spanning tree embedded in hypercubes and its properties to our algorithms, we improve the efficiency of breadth-first search (BFS). By introducing the result shown by Yang that an n-dimensional cube with a set F of at most (4n –10) failing processors has a connected component of size ≥2n-|F|-3 and the completely unbalanced spanning tree embedded in hypercubes, the first algorithm runs in O(Nlog2N) time complexity with 4n-7 fault-tolerance capability. The algorithm searches for the largest connected component of a hypercube to isolate faulty nodes, whose time complexity is as good as that of the current best algorithm but with more fault-tolerance capability. The second algorithm is also derived from the breadth-first search method. Inspired by a cycle-decomposition technique, we design a linear time complexity algorithm with 2n-2 fault-tolerance capability through proposing a new tree decomposition technique and utilizing the properties of completely unbalanced spanning tree in hypercubes. In order to enhance the performance of our algorithms, we explore some new properties of hypercubes and apply them to our algorithms. One algorithm is superior to others in fault-tolerance capability and the other reaches the linear time complexity.
Keywords/Search Tags:Fault diagnosis, Tree decomposition, Hypercube, Fault tolerance, Networks, Algorithm, Parallel computing
PDF Full Text Request
Related items