
Research Of KLU Algorithm Based On GPU: LU Factorization Of Diagonal Blocks

Posted on: 2012-03-04 | Degree: Master | Type: Thesis
Country: China | Candidate: C W You | Full Text: PDF
GTID: 2178330335972284 | Subject: Computer system architecture
Abstract/Summary:
With the development of electronic science and technology, circuit simulation has become a necessary part of designing high-quality circuits. Simulating a circuit involves solving sparse systems of linear equations, and as circuit matrices grow in scale, solving them has become a bottleneck in the simulation process. Given the characteristics of the matrices produced during circuit simulation, direct methods are usually adopted to solve these systems. Commonly used solvers include Sparse1.3, SuperLU, and KLU, of which KLU, developed by Timothy Davis, is the most efficient.
KLU consists mainly of a preprocessing phase, a first LU factorization phase, a refactorization phase, and a back-substitution (solve) phase. Refactorization is an important part of the algorithm: during a circuit simulation it is called many times to carry out the numerical LU decomposition of the sparse matrix. This thesis therefore researches and explores parallel algorithms for this phase on the GPU platform.
For the LU decomposition, KLU adopts the Gilbert-Peierls algorithm, which is based on Gaussian elimination. This work studies the serial algorithm and its program, puts forward two different parallelization ideas, and designs and implements four parallel algorithms on the GPU platform: P_Llen, P_Ulen, P_nl, and P_stream. We tested and analysed the performance of the four parallel algorithms on experimental platform I. The analysis shows that P_stream has a clear performance advantage over the other three parallel algorithms, but because P_stream is restricted by the GPU memory limit, its performance is still lower than that of the serial algorithm.
To raise the degree of parallelism of P_stream, we also tested and analysed its performance on experimental platform II, which has a larger GPU memory capacity. The analysis shows that the performance of P_stream improves as the degree of parallelism increases, but it still does not surpass the serial algorithm and remains limited by GPU memory.
This is a first attempt to research and implement a parallel KLU algorithm on the GPU platform. Because of the sparseness of the matrix data, the data dependences in LU factorization, hardware constraints, and the author's limited programming experience, the parallel algorithms perform slightly worse than the original serial algorithm. Nevertheless, the parallelization ideas and experiments proposed in this thesis can serve as a useful reference for researchers working in the same direction.
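As an illustration of the column-by-column (left-looking) factorization style that Gilbert-Peierls belongs to, the following is a minimal serial sketch in Python. It is not the thesis's code and not KLU's implementation: it stores matrices densely, does no pivoting, and replaces the symbolic reachability analysis of the real algorithm with a simple test that skips zero entries. The names `left_looking_lu` and `matmul` are hypothetical helpers introduced only for this example.

```python
def left_looking_lu(A):
    """Factor a square matrix A (list of rows) as A = L * U, one column at a time.

    L is unit lower triangular and U is upper triangular.
    Assumption for this sketch: no pivoting, so every pivot must be nonzero.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # x starts as column k of A.
        x = [float(A[i][k]) for i in range(n)]
        # Left-looking step: apply the already computed columns j < k of L.
        # A column j with x[j] == 0 contributes nothing and is skipped; the
        # real Gilbert-Peierls algorithm predicts these nonzeros symbolically.
        for j in range(k):
            if x[j] != 0.0:
                for i in range(j + 1, n):
                    x[i] -= L[i][j] * x[j]
        if x[k] == 0.0:
            raise ZeroDivisionError(f"zero pivot in column {k}")
        # Entries on and above the diagonal form column k of U.
        for i in range(k + 1):
            U[i][k] = x[i]
        # Entries below the diagonal, scaled by the pivot, form column k of L.
        for i in range(k + 1, n):
            L[i][k] = x[i] / x[k]
    return L, U


def matmul(X, Y):
    """Dense matrix product, used only to check that L * U reproduces A."""
    n, m, p = len(X), len(Y), len(Y[0])
    return [[sum(X[i][t] * Y[t][j] for t in range(m)) for j in range(p)]
            for i in range(n)]


if __name__ == "__main__":
    A = [[4.0, 3.0, 0.0], [6.0, 3.0, 1.0], [0.0, 2.0, 5.0]]
    L, U = left_looking_lu(A)
    P = matmul(L, U)
    residual = max(abs(P[i][j] - A[i][j]) for i in range(3) for j in range(3))
    print("max |L*U - A| =", residual)
```

Column k cannot be computed until every earlier column of L that it touches is final. This is the data dependence that the abstract names as one obstacle to parallelizing the factorization on a GPU.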
Keywords/Search Tags: KLU algorithm, numeric factorization, GPU, parallel algorithm