Design And Implementation Of High-speed And High-precision Matrix Operator

Posted on:2021-02-18

Degree:Master

Type:Thesis

Country:China

Candidate:J H Qiu

Full Text:PDF

GTID:2428330614960255

Subject:Integrated circuits and systems

Abstract/Summary:

With the arrival of the big-data era, emerging technologies such as artificial intelligence and cloud computing are widely used. Modern digital signal processing systems must process complex signals with high dimensions, high accuracy, and high bandwidth in real time. Matrix operations are a basic building block of such systems and have a wide range of applications. Matrix inversion is among the most complex and most widely used matrix operations, and it has received significant attention from researchers in China and abroad. Many effective matrix inversion algorithms have been proposed, verified, and implemented on different hardware platforms. However, the computational cost of matrix inversion grows rapidly with matrix size, while hardware resources are limited. As a result, existing designs usually target special matrices or small matrices, and research on methods and hardware implementations for inverting large arbitrary matrices is relatively rare. Large-scale non-singular matrix inversion is one of the most challenging and unavoidable problems in digital signal processing today, and it has practical significance and engineering value. To address these problems, this thesis studies matrix inversion algorithms and their hardware architecture in depth. The main contributions are as follows:

(1) The thesis analyzes various matrix inversion algorithms and selects an inversion algorithm based on Givens-QR decomposition, weighing numerical stability, computational complexity, and difficulty of hardware implementation. Then, following the execution characteristics of the algorithm, it designs a mixed-granularity parallel Givens-QR decomposition algorithm based on in-place replacement and a block-recursive algorithm for inverting the upper triangular matrix, so as to fully exploit the parallelism of the algorithm.

(2) Based on the optimized inversion algorithm, a matrix operation hardware accelerator with matrix inversion at its core is designed. Starting from the two-dimensional systolic array structure, the thesis derives a one-dimensional linear pipeline structure that effectively reduces the computing resources required. The operator directly accelerates double-precision floating-point matrix inversion for matrices of order 2 to 32, and also supports linear matrix operations, matrix array multiplication, and matrix transposition.

(3) All front-end and back-end design work for the matrix operator is completed, and a verification environment is built on the Xilinx XC7V2000T FPGA platform, on which verification is completed. The results show that the matrix operator, implemented in the TSMC 28 nm process, runs at a clock frequency of 700 MHz, occupies a chip area of 2.25 mm², and performs all the intended matrix operations. A 32-order double-precision floating-point matrix inversion takes 14910 cycles with a calculation accuracy of 10^-15, and its speed is 140 times that of an NVIDIA RTX 2070 GPU.
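
The inversion route named in the abstract (Givens-QR decomposition followed by inversion of the upper triangular factor) can be illustrated with a minimal sequential sketch. This is not the thesis's mixed-granularity parallel algorithm or its block-recursive triangular inversion, which the abstract does not spell out. It is the plain textbook form A^-1 = R^-1 Q^T, and the function name `givens_qr_inverse` is a hypothetical chosen for this example.

```python
import math

def givens_qr_inverse(a):
    """Invert a non-singular square matrix via Givens-QR: A^-1 = R^-1 * Q^T.

    Textbook sequential sketch (hypothetical helper, not the thesis's design).
    Raises ZeroDivisionError if a diagonal entry of R is exactly zero.
    """
    n = len(a)
    r = [[float(v) for v in row] for row in a]  # working copy, becomes R
    # Identity matrix that accumulates the rotations, ending up as Q^T.
    qt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    # Zero the sub-diagonal entries column by column, from the bottom up.
    for j in range(n):
        for i in range(n - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Apply the same rotation to rows i-1 and i of both R and Q^T.
            for m in (r, qt):
                for k in range(n):
                    top, bot = m[i - 1][k], m[i][k]
                    m[i - 1][k] = c * top + s * bot
                    m[i][k] = -s * top + c * bot
            r[i][j] = 0.0  # remove rounding residue in the eliminated entry

    # Invert the upper triangular R by back substitution, one column at a time.
    r_inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r_inv[j][j] = 1.0 / r[j][j]
        for i in range(j - 1, -1, -1):
            acc = sum(r[i][k] * r_inv[k][j] for k in range(i + 1, j + 1))
            r_inv[i][j] = -acc / r[i][i]

    # A^-1 = R^-1 * Q^T
    return [[sum(r_inv[i][k] * qt[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Calling `givens_qr_inverse([[4.0, 3.0], [6.0, 3.0]])` returns a matrix close to `[[-0.5, 0.5], [1.0, -0.6667]]`, up to floating-point rounding. For scale, the reported 14910 cycles at 700 MHz corresponds to about 21.3 µs per 32-order inversion.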

Keywords/Search Tags:

Matrix inversion, hardware acceleration, ASIC implementation, Givens decomposition

Related items

1	Research On Hardware Acceleration Technology For The Matrix
2	Hardware Implementation And Verification Of Matrix Inversion Based On LU Decomposition
3	Research On Key Technologies Of RNN Algorithms Optimization And Hardware Acceleration
4	FPGA-based Matrix Inversion IP Core Design Technology And Related Experiment Platform Design
5	Hardware Implementation Of Sample Matrix Inversion Algorithm Based On FPGA
6	The Circuits Design And Optimization Of Large-Scale Matrix Inversion
7	The Acceleration Of Matrix Inversion Based On Two-dimensional Mesh NoC
8	Hardware Acceleration Design Technology For High Density Computing Many-core
9	Application Of QR Decomposition Techniques In Recursive System Identification
10	Research And Optimization Of Neural Network Acceleration Algorithm