Research And Implementation Of A Large Scale Matrix LU Decomposition And Inversion Algorithm Based On Spark Platform

Posted on:2017-02-27

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Zhao

Full Text:PDF

GTID:2180330485460541

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

Matrix inversion operation is a fundamental building block of many computational tasks in many fields such as machine learning and image processing. With the development of computer science, the matrices to be processed are usually very large and become even larger in the current big data period. Therefore, the parallelization of the matrix inversion operation has become an important research topic.With the development of Spark, machine learning algorithms on Spark have gradually become a popular research topic nowadays. The inversion of a matrix can be computed using many methods, such as LU decomposition, SVD decomposition and QR decomposition. In this thesis, the inversion of input matrix is computed based on LU decomposition because using LU decomposition method for matrix inversion is advantageous in parallelization over other methods. This thesis analyzes Spark platformâ€™s feature and the characteristics of LU decomposition algorithm, then proposes a Spark based parallel matrix inversion algorithm. There are two steps in the algorithm:(1) Spark based large scale matrix LU decomposition algorithm is designed to calculate the input matrixâ€™s LU decomposition. Based on the parallel LU decomposition algorithm, the input matrix is splitted into a set of independent blocks, which are fitted into memory and iteratively processed to get the lower triangular matrix and upper triangular matrix. (2) Based on the triangular matrices in step one and block matrix principle, Spark based triangular matrix inversion algorithm is proposed to invert the two triangular matrices. Therefore the inversion of the input matrix can be computed with the two inverted triangular matrices. In the implementation process, three methods are recommended to optimize efficiency of the algorithm:(1) Cache the intermediate results which are frequently used into memory. (2) Tune the efficiency of shuffle process. (3) Use non-recursive method to implement the algorithm.The algorithm in this thesis is valuable to other Spark based matrix operation researchers. In the experiments, the scalability of our algorithm on Spark is validated in both local mode with multiple cores and cluster mode with multiple computing nodes. The algorithm is compared with the conventional matrix inversion algorithm on Hadoop. Experimental results indicate that the algorithm achieves great scalabilities both on local mode with multiple cores and cluster mode with multiple computing nodes. Compared with the previous Hadoop based algorithm, efficiency has been promoted by 24.6%.

Keywords/Search Tags:

Parallel Computing, LU Decomposition, Matrix Inversion, Spark, MapReduce

PDF Full Text Request

Related items

1	Parallel Computing Matrix Operations
2	Research On Parallel Computing Technology Of CME Parameters Detection Model Based On MapReduce
3	Parallel Construction Technology Based On The MapReduce VoR-Tree Index
4	Parallel Computing Of Spark-based Geospatial Analysis Algorithms
5	Research And Implementation Of Seismic Big Data Parallel Processing System Based On Spark
6	Parallelize Research And Application Of Tensor Decomposition Algorithm Based On GAS Computing Model
7	The Study On The Gravity And Magnetic Inversion Based On Parallel Computing And Deep Learning Algorithm
8	The Design And Implementation Of Community Detect Algorithm Based On Spark For Large Scale And Complex Network
9	Three-Dimensional Modeling And Inversion Of Controlled Source Audio-frequency Magneto Telluric Based On Domain Decomposition
10	On-diagonal Matrix Moore-penrose Inverse Parallel Computing