Research On Parallel Solution Method Of Small And Medium Sized Linear Equations Based On GPU

Posted on:2024-04-12

Degree:Master

Type:Thesis

Country:China

Candidate:X L Lei

GTID:2568307064985169

Subject:Computer Science and Technology

Abstract/Summary:

In complex system simulation, a large number of small and medium-sized nonlinear simulation model units must be solved approximately by linearization. For dense matrices that are not positive definite, LU decomposition is usually used. This decomposition, based on Gaussian elimination, is computationally expensive and time-consuming, which seriously limits simulation speed. The problem becomes more prominent as simulation scenarios grow increasingly complex. An efficient method for solving small and medium-sized linear equations is therefore of great significance for speeding up simulation. To address these problems, this thesis uses NVIDIA's CUDA programming technology to study parallel solution methods for small and medium-sized linear equations, and parallelizes and optimizes the traditional LU decomposition algorithm on the GPU. The main work and contributions are:

(1) This thesis designs and implements a high-performance batched parallel LU decomposition algorithm for small and medium-sized dense matrices. It presents eight versions of the batched parallel LU decomposition algorithm, which exploit hardware features through techniques such as data reorganization, coalesced global memory access, and local variable caching. The algorithms effectively hide memory access latency and increase the proportion of time spent on useful computation. Experimental results show that performance grows approximately linearly as the number of batches increases, peaking at close to 450 GFLOPS. Compared with the NVIDIA cuBLAS library, the maximum speedup is close to 18x.

(2) Building on the LU decomposition algorithm, this thesis designs and implements an implicit parallel algorithm for solving large batches of small and medium-sized linear equations. The algorithm uses a right-looking parallel back substitution process, which effectively accelerates the solution. Test results on specific cases show that the algorithm's average solution speed is more than 3 times that of the batched linear equation solver API provided by the NVIDIA cuBLAS library. The algorithm provides implicit, real-time parallel solution capability that can support small simulation models at the million scale.
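As a rough illustration of the two steps the abstract describes (LU factorization of each small dense matrix, followed by substitution to solve the system), here is a minimal sequential sketch in plain Python. It is not the thesis's CUDA implementation: the function names `lu_factor`, `lu_solve`, and `solve_batch` are hypothetical, the use of partial pivoting is an assumption the abstract does not confirm, and the column-oriented substitution shown is only one common reading of "right-looking". On a GPU, each system in the batch would presumably be assigned to its own thread or thread block, and the inner update loops, whose iterations are independent, are the natural candidates for parallel execution.

```python
def lu_factor(a):
    """Factor a square matrix (list of lists) in place as P*A = L*U.

    Assumption: partial pivoting is used. L (unit diagonal) is stored
    below the diagonal, U on and above it. Returns the row permutation.
    """
    n = len(a)
    perm = list(range(n))
    for k in range(n):
        # Choose the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            a[k], a[p] = a[p], a[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Right-looking update: scale column k, then update the trailing block.
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return perm


def lu_solve(lu, perm, b):
    """Solve A*x = b given the packed LU factors and permutation."""
    n = len(lu)
    x = [b[p] for p in perm]  # apply the row permutation to b
    # Forward substitution with L: once x[k] is final, update all later entries.
    for k in range(n):
        for i in range(k + 1, n):
            x[i] -= lu[i][k] * x[k]
    # Back substitution with U: finalize x[k], then update all earlier entries.
    for k in range(n - 1, -1, -1):
        x[k] /= lu[k][k]
        for i in range(k):
            x[i] -= lu[i][k] * x[k]
    return x


def solve_batch(mats, rhs):
    """Solve every system in the batch independently (sequentially here)."""
    out = []
    for a, b in zip(mats, rhs):
        lu = [row[:] for row in a]  # copy so the input matrix is preserved
        perm = lu_factor(lu)
        out.append(lu_solve(lu, perm, b))
    return out
```

For example, `solve_batch([[[2.0, 1.0], [1.0, 3.0]]], [[3.0, 5.0]])` solves the single system 2x + y = 3, x + 3y = 5. Because the systems in a batch share no data, the outer loop in `solve_batch` is the part a batched GPU solver would distribute across the device.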

Keywords/Search Tags:

Parallel computing, LU decomposition, Solution of linear equations, Small and medium dense matrices, CUDA

Related items

1	Design Of The Solver For Large Scale Linear Sparse Equations Based On CUDA
2	Parallel Algorithms And Architectures For Matrix Computations On FPGA
3	The Study Of Parallel Algorithms For The Solution Of Special Structured Large Linear System Of Equations Under Distributed Memory Environment
4	Study On Dense Stereo Image Matching Based On Parallel Computing
5	On Development Of CAI:Solving Linear Equations
6	Accelerating Typical SVM Algorithms Through CUDA Platform
7	Outsourcing Large-scale Matrix Decomposition To A Public Cloud
8	Parallel Solution Of Finite Element Equations Using Multi-color SSOR-PCG
9	Three Types Of Outsourced Computing On Cloud Computing
10	Implementation Of Two-dimensional DFT Parallel Algorithm On CUDA