Design and Implementation of a High-Speed and High-Precision Matrix Operator

Posted on: 2021-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: J H Qiu
GTID: 2428330614960255
Subject: Integrated circuits and systems
Abstract/Summary:
With the arrival of the big data era, emerging technologies such as artificial intelligence and cloud computing are widely used. Modern digital signal processing systems need to process high-dimensional, high-accuracy, high-bandwidth complex signals in real time. Matrix operations are a basic building block of signal processing systems and have a wide range of applications. Matrix inversion is among the most complex and widely used of these operations and has received significant attention from researchers in China and abroad. Many effective matrix inversion algorithms have been proposed, verified, and implemented on different hardware platforms. However, the operation count of matrix inversion grows rapidly with matrix size (roughly cubically for direct methods), while hardware implementations have limited resources. As a result, existing designs usually target special matrices or small matrices, and research on methods and hardware implementations for inverting large arbitrary matrices is relatively rare. Large-scale non-singular matrix inversion is therefore one of the most challenging and unavoidable topics in digital signal processing, with important practical significance and engineering value. To address these problems, this thesis studies the matrix inversion algorithm and its hardware architecture in depth. The main contents are as follows:

(1) The thesis analyzes various matrix inversion algorithms and selects one based on Givens-QR decomposition, weighing numerical stability, computational complexity, and hardware implementation difficulty. Then, according to the running characteristics of the algorithm, a mixed-granularity parallel Givens-QR decomposition algorithm based on in-place replacement and a block recursive algorithm for inverting the upper triangular matrix are designed to fully exploit the parallelism of the algorithm.

(2) Based on the optimized inversion algorithm, a matrix operation hardware accelerator with matrix inversion at its core is designed. The thesis designs a one-dimensional linear pipeline structure based on the two-dimensional systolic array structure, which effectively reduces the computing resources required. The operator can directly accelerate double-precision floating-point matrix inversion for matrices of order 2 to 32, and also supports linear matrix operations, matrix array multiplication, and matrix transpose operations.

(3) All front-end and back-end design work for the matrix operator is completed, and a verification environment is built on the Xilinx XC7V2000T FPGA platform to verify the design. The results show that the matrix operator, implemented in the TSMC 28 nm process, runs at a clock frequency of 700 MHz, occupies a chip area of 2.25 mm², and completes all the intended matrix operation functions. Inverting a 32nd-order double-precision floating-point matrix takes 14,910 cycles, the calculation accuracy reaches 10^-15, and the speed is 140 times that of an NVIDIA RTX 2070 GPU.
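As a software illustration of the inversion flow the abstract describes, the sketch below triangularizes A with Givens rotations so that Q^T A = R, inverts the upper triangular R by a two-way block recursion, and forms A^-1 = R^-1 Q^T. It is a minimal sequential sketch in plain Python, not the thesis's parallel algorithm or hardware design; the function names (`givens_qr`, `inv_upper`, `invert`, `matmul`), the rotation order, and the half-and-half block split are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def givens_qr(a):
    """Return (q_t, r) with q_t * a = r and r upper triangular (hypothetical helper)."""
    n = len(a)
    r = [row[:] for row in a]  # working copy, overwritten row by row
    q_t = [[float(i == j) for j in range(n)] for i in range(n)]  # accumulates Q^T
    for j in range(n):  # column being cleared
        for i in range(n - 1, j, -1):  # zero r[i][j] against r[i-1][j], bottom up
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            for m in (r, q_t):  # apply the same rotation to R and to Q^T
                for k in range(n):
                    u, v = m[i - 1][k], m[i][k]
                    m[i - 1][k] = c * u + s * v
                    m[i][k] = -s * u + c * v
            r[i][j] = 0.0  # remove rounding residue in the cleared entry
    return q_t, r


def inv_upper(r):
    """Invert an upper triangular matrix by block recursion.

    For R = [[R11, R12], [0, R22]] the inverse is
    [[X11, -X11 * R12 * X22], [0, X22]] with X11 = R11^-1 and X22 = R22^-1.
    Raises ZeroDivisionError if a diagonal entry is exactly zero.
    """
    n = len(r)
    if n == 1:
        return [[1.0 / r[0][0]]]
    h = n // 2  # assumed split point: halve the matrix at each level
    r11 = [row[:h] for row in r[:h]]
    r12 = [row[h:] for row in r[:h]]
    r22 = [row[h:] for row in r[h:]]
    x11, x22 = inv_upper(r11), inv_upper(r22)
    x12 = [[-v for v in row] for row in matmul(matmul(x11, r12), x22)]
    top = [left + right for left, right in zip(x11, x12)]
    bottom = [[0.0] * h + row for row in x22]
    return top + bottom


def invert(a):
    """Invert a non-singular square matrix via A^-1 = R^-1 * Q^T."""
    q_t, r = givens_qr(a)
    return matmul(inv_upper(r), q_t)
```

Calling `invert(a)` on a small non-singular matrix and multiplying the result by `a` with `matmul` should give the identity up to rounding error. The two recursive calls in `inv_upper` are independent of each other, which is the kind of parallelism a block recursive formulation exposes.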
Keywords/Search Tags:Matrix inversion, hardware acceleration, ASIC implementation, Givens decomposition