
Research On Hardware Acceleration Technology For The Matrix

Posted on: 2011-03-29  Degree: Master  Type: Thesis
Country: China  Candidate: L Guo  Full Text: PDF
GTID: 2178330338990020  Subject: Computer Science and Technology
Abstract/Summary:
Scientific computing underlies many important applications, such as aeromechanics, modern biomedicine, oil reservoir modeling, environmental science, and nuclear simulation. Matrix decomposition and inversion are fundamental operations in scientific computing, so studying acceleration techniques for them has both theoretical significance and practical value. This thesis studies fine-grained parallel acceleration techniques for the decomposition and inversion of two kinds of matrices. The detailed work is as follows.
(1) Symmetric positive definite matrices are an important class of matrices in practical applications, and Cholesky decomposition plays a significant role in scientific computing. We first analyze the data dependencies, then propose a fine-grained parallel algorithm and structure for Cholesky decomposition. Finally, we implement a single-precision floating-point accelerator for Cholesky decomposition based on the proposed structure and build its performance model.
(2) We study the parallel structure and implementation of a hardware accelerator for the LDLT decomposition of symmetric matrices. As with Cholesky decomposition, we first propose a fine-grained parallel algorithm and structure based on data dependency analysis. We then implement a single-precision floating-point LDLT decomposition accelerator and present its performance model.
(3) We study the parallel implementation of matrix inversion based on Cholesky decomposition. We first present fine-grained parallel algorithms for upper triangular matrix inversion and multiplication. We then propose a storage scheme for upper triangular matrices that alleviates the memory access bottleneck. Finally, we implement a single-precision floating-point matrix inversion accelerator based on Cholesky decomposition and give its performance model.
Keywords/Search Tags:Scientific Computing, Cholesky Decomposition, LDLT Decomposition, Matrix Inversion, Fine-grained Parallel, FPGA
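
For readers unfamiliar with the three operations named in the abstract, the following is a minimal sequential sketch in Python. It is not the thesis's hardware design or its parallel algorithms: the function names (cholesky_upper, ldlt, invert_upper, spd_inverse) are hypothetical, the arithmetic is Python double precision rather than single precision, and the upper triangular factor is stored as a full square matrix rather than in the storage scheme the thesis proposes. The comments mark the inner loops whose iterations are independent of each other, which is the kind of dependency structure a fine-grained parallel design could exploit.

```python
import math


def cholesky_upper(a):
    """Return upper triangular U with A = U^T U (A symmetric positive definite)."""
    n = len(a)
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The diagonal entry depends on the already computed entries above it.
        s = a[i][i] - sum(u[k][i] ** 2 for k in range(i))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        u[i][i] = math.sqrt(s)
        # Once u[i][i] is known, the entries of row i do not depend on each other.
        for j in range(i + 1, n):
            dot = sum(u[k][i] * u[k][j] for k in range(i))
            u[i][j] = (a[i][j] - dot) / u[i][i]
    return u


def ldlt(a):
    """Return (L, d) with A = L diag(d) L^T, L unit lower triangular (no pivoting)."""
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        if d[j] == 0.0:
            raise ValueError("zero pivot; this sketch does no pivoting")
        # Entries of column j below the diagonal are independent of each other.
        for i in range(j + 1, n):
            dot = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / d[j]
    return l, d


def invert_upper(u):
    """Return V = U^{-1} for a nonsingular upper triangular U."""
    n = len(u)
    v = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v[j][j] = 1.0 / u[j][j]
        # Column j uses only columns i..j-1 of V, all computed in earlier passes.
        for i in range(j):
            dot = sum(v[i][k] * u[k][j] for k in range(i, j))
            v[i][j] = -dot / u[j][j]
    return v


def spd_inverse(a):
    """Invert A via A = U^T U, so A^{-1} = U^{-1} (U^{-1})^T."""
    n = len(a)
    v = invert_upper(cholesky_upper(a))
    # V is upper triangular, so only terms with k >= max(i, j) are nonzero.
    return [
        [sum(v[i][k] * v[j][k] for k in range(max(i, j), n)) for j in range(n)]
        for i in range(n)
    ]
```

As a usage example, calling spd_inverse([[4.0, 2.0], [2.0, 3.0]]) inverts a small symmetric positive definite matrix, and ldlt on the same input returns the unit lower triangular factor together with the diagonal entries as a list.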