Sparse Matrix Vector Multiplication Based On CPU And GPU

Posted on:2022-01-23

Degree:Master

Type:Thesis

Country:China

Candidate:G Y Yu

Full Text:PDF

GTID:2518306575962249

Subject:Computer system architecture

Abstract/Summary:

Sparse matrix-vector multiplication (SPMV) is of great significance for computation in many science and engineering disciplines. However, because sparse matrices vary arbitrarily in sparsity pattern and size, parallelizing SPMV is still plagued by several performance problems, including poor memory coalescing, thread divergence, and load imbalance. For this project, I implemented five different GPU-based algorithms and analyzed their performance on data of different types and sizes, aiming to draw insights into the strengths and weaknesses of these algorithms and the cases each suits best. I measure performance by throughput, memory bandwidth utilization, and other system-generated metrics. On the CPU side, the number of cores in multi-core processor chips continues to increase, so it is also important to run and parallelize SPMV on multi-core processors. This thesis introduces two storage formats for sparse matrices, CSR and BCSR, and parallelizes SPMV across cores with OpenMP. It also studies and analyzes a variety of optimization techniques, represented by register blocking, and examines how the various scheduling strategies and SIMD instruction sets affect multithreaded SPMV. The test results on the server show that most of the matrices scale well and some even achieve superlinear speedup. Compared with the multithreaded SPMV based on CSR, the register-blocking optimization improves SPMV speed by 28.%. The three scheduling policies supported by the OpenMP standard are also tested and studied. The results show that the SIMD instruction set achieves better load balancing than the scheduling policies supported by the OpenMP standard, and that both dynamic and guided scheduling outperform the static scheduling policy in multithreaded SPMV.
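
To make the CSR and BCSR formats and the OpenMP scheduling choices concrete, here is a minimal sketch in C. It is not the thesis's implementation: the function names spmv_csr and spmv_bcsr2x2, the fixed 2x2 block size, the row-major block layout, and the chunk size of 64 are all assumptions chosen for illustration. The CSR kernel assigns one row per loop iteration and uses dynamic scheduling, while the BCSR kernel keeps the two partial sums of a block row in local variables, which is the idea behind register blocking.

```c
#include <assert.h>
#include <stddef.h>

/* y = A * x, with A stored in CSR form.
 * row_ptr has n_rows + 1 entries; the nonzeros of row i are
 * val[row_ptr[i] .. row_ptr[i+1]-1], with column indices in col_idx. */
void spmv_csr(int n_rows, const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
    assert(n_rows >= 0);
    /* Rows can have very different lengths, so a dynamic schedule hands
     * out chunks of rows on demand (chunk size 64 is an arbitrary choice). */
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic, 64)
#endif
    for (int i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

/* y = A * x, with A stored in BCSR form using fixed 2x2 blocks
 * (an assumed block size). brow_ptr has n_brows + 1 entries, bcol_idx
 * holds block-column indices, and bval stores each block as 4 doubles
 * in row-major order. The matrix has 2 * n_brows rows. */
void spmv_bcsr2x2(int n_brows, const int *brow_ptr, const int *bcol_idx,
                  const double *bval, const double *x, double *y)
{
    assert(n_brows >= 0);
#ifdef _OPENMP
#pragma omp parallel for schedule(static)
#endif
    for (int ib = 0; ib < n_brows; ++ib) {
        /* The two output values of this block row stay in local
         * variables for the whole inner loop. */
        double y0 = 0.0, y1 = 0.0;
        for (int k = brow_ptr[ib]; k < brow_ptr[ib + 1]; ++k) {
            const double *b = bval + 4 * (size_t)k;
            double x0 = x[2 * bcol_idx[k]];
            double x1 = x[2 * bcol_idx[k] + 1];
            y0 += b[0] * x0 + b[1] * x1;
            y1 += b[2] * x0 + b[3] * x1;
        }
        y[2 * ib] = y0;
        y[2 * ib + 1] = y1;
    }
}
```

Built with an OpenMP-enabled compiler (for example with GCC's -fopenmp flag), the loops run in parallel; without it the pragmas are skipped and the same code runs serially. Changing schedule(dynamic, 64) to schedule(static) or schedule(guided) is how the three OpenMP scheduling policies mentioned in the abstract could be compared on a given matrix.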

Keywords/Search Tags:

GPU, CPU, parallel computing, sparse matrix, SPMV

Related items

1	Sparse Matrix Vector Multiplication Based On CPU And GPU
2	Research On Sparse Matrix-Vector Multiplication And Convex Hull Algorithm Based On GPU
3	Computing SpMV on FPGAs
4	Efficient Sparse Matrix Vector Multiplications On New Many-core Architectures
5	The Research Of Hybrid Schedulling Model Based On Sparse Matrix And Parallel Algorithm
6	Parallel Algorithms And Architectures For Matrix Computations On FPGA
7	Design And Verification Of DMA For Sparse Matrix Vector Multiplication
8	Sparse Matrix Computation On Heterogeneous Architectures
9	Sparse Matrix Matrix-Vector Multiplication And Auto-Tuning
10	Optimizing Sparse Matrix-vector Multi Based On OpenCL