The Design And Application Of SVD Algorithm Based On Spark Platform

Posted on:2016-08-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Ou

Full Text:PDF

GTID:2348330479954619

Subject:Electronics and Communications Engineering

Abstract/Summary:

In linear algebra, the singular value decomposition (SVD) is an important matrix factorization. It is widely used in signal processing and machine learning for reducing the dimensionality of complex data sets, principal component analysis, noise filtering and similar tasks. In the era of big data, traditional SVD algorithms cannot cope with massive data sets. Combining a data processing platform with the design of efficient distributed algorithms has therefore become a significant and challenging research topic.

Spark, developed by the AMPLab at the University of California, Berkeley, is a distributed computing framework based on in-memory computation. Compared with the MapReduce framework, Spark is well suited to iterative computation and handles large volumes of complex data efficiently, which makes it convenient for developing distributed iterative algorithms.

To address the problem of massive data processing, this thesis proposes a parallel SVD algorithm for large-scale sparse matrices and implements it on the Spark platform. Two problems must be addressed in the big data setting: the first is preserving the sparsity of the data, and the second is making the algorithm convenient and efficient to parallelize. To deal with these problems, an SVD algorithm based on the Lanczos algorithm, the bisection method (the "binary algorithm") and the inverse power method (inverse iteration) is proposed. The Lanczos algorithm transforms a real symmetric matrix into a symmetric tridiagonal matrix by an orthogonal similarity transformation, and is one of the most effective methods for solving large-scale eigenvalue problems. The bisection method and the inverse power method are then used to compute the eigenvalues and the eigenvectors, respectively, of the tridiagonal matrix efficiently.
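The pipeline described above (Lanczos tridiagonalization, bisection for eigenvalues, inverse iteration for eigenvectors) can be illustrated with a small single-machine sketch. This is not the thesis implementation: it uses NumPy instead of Spark, applies Lanczos to the implicit product A^T A through matrix-vector products only (so a sparse A is never densified), uses full reorthogonalization, and solves the shifted tridiagonal system with a dense solver for brevity. All function names are hypothetical, and A is assumed to have full column rank.

```python
import numpy as np

def lanczos(matvec, n, k, seed=0):
    """k Lanczos steps on a symmetric operator given only as matvec.
    Returns Q, alpha, beta with Q^T M Q = tridiag(beta, alpha, beta)."""
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(max(k - 1, 0))
    q = np.random.default_rng(seed).standard_normal(n)
    Q[:, 0] = q / np.linalg.norm(q)
    for j in range(k):
        w = matvec(Q[:, j])
        alpha[j] = Q[:, j] @ w
        for _ in range(2):  # full reorthogonalization, applied twice
            w = w - Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j + 1 < k:
            beta[j] = np.linalg.norm(w)
            if beta[j] < 1e-12:  # invariant subspace reached, stop early
                return Q[:, :j + 1], alpha[:j + 1], beta[:j]
            Q[:, j + 1] = w / beta[j]
    return Q, alpha, beta

def count_below(alpha, beta, x):
    """Sturm count: number of eigenvalues of the tridiagonal matrix below x."""
    count, d = 0, 1.0
    for i in range(len(alpha)):
        off = beta[i - 1] ** 2 if i > 0 else 0.0
        d = alpha[i] - x - off / d
        if abs(d) < 1e-300:
            d = -1e-300  # avoid dividing by zero in the next step
        if d < 0:
            count += 1
    return count

def bisect_eigenvalue(alpha, beta, i, tol=1e-12):
    """i-th smallest eigenvalue (0-based), bracketed by Gershgorin bounds."""
    b = np.abs(np.concatenate(([0.0], beta, [0.0])))
    radius = b[:-1] + b[1:]
    lo, hi = np.min(alpha - radius), np.max(alpha + radius)
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(alpha, beta, mid) > i:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def inverse_iteration(alpha, beta, lam, iters=3, seed=1):
    """Eigenvector of the tridiagonal matrix for the eigenvalue near lam."""
    k = len(alpha)
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    # Small offset keeps the shifted matrix from being exactly singular.
    shifted = T - (lam + 1e-10 * max(1.0, abs(lam))) * np.eye(k)
    y = np.random.default_rng(seed).standard_normal(k)
    for _ in range(iters):
        y = np.linalg.solve(shifted, y)
        y /= np.linalg.norm(y)
    return y

def lanczos_svd(A, k):
    """Approximate the k largest singular triplets of A (full column rank assumed)."""
    Q, alpha, beta = lanczos(lambda v: A.T @ (A @ v), A.shape[1], k)
    sigmas, V = [], []
    for i in range(len(alpha) - 1, -1, -1):  # largest eigenvalue first
        lam = bisect_eigenvalue(alpha, beta, i)
        y = inverse_iteration(alpha, beta, lam)
        sigmas.append(np.sqrt(max(lam, 0.0)))
        V.append(Q @ y)  # map the tridiagonal eigenvector back to R^n
    sigmas, V = np.array(sigmas), np.column_stack(V)
    U = (A @ V) / sigmas  # left singular vectors: u = A v / sigma
    return U, sigmas, V
```

In a distributed setting, the only step that touches the large matrix is the `matvec` call, which is presumably where the parallelism would go; the tridiagonal problem is small enough to solve on a single node. Calling `lanczos_svd(A, k)` with `k` equal to the number of columns of a small dense `A` should reproduce the singular values returned by `np.linalg.svd` up to the bisection tolerance.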
Experiments on the accuracy and efficiency of the parallel SVD algorithm on the Spark platform show that it is highly efficient for large-scale data processing.

The thesis also proposes a new application of the SVD algorithm to query recommendation in the field of information retrieval. Using SVD, a latent semantic analysis (LSA) model is built from the text of the clicked titles in a search engine query log and is used to calculate the similarity between queries. The results show that the algorithm also performs well in query recommendation.
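The query-similarity idea can be sketched as follows. This is an illustrative toy example, not the model from the thesis: it assumes each query is represented by raw term counts over the titles clicked for it, uses NumPy's dense SVD rather than a distributed one, and compares queries by cosine similarity in the truncated latent space. The function names and the sample log are hypothetical.

```python
import numpy as np

def lsa_query_vectors(query_titles, rank):
    """Map each query to a latent vector via a truncated SVD of the
    query-by-term count matrix built from its clicked titles."""
    queries = sorted(query_titles)
    vocab = sorted({w for titles in query_titles.values()
                    for t in titles for w in t.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(queries), len(vocab)))
    for i, q in enumerate(queries):
        for title in query_titles[q]:
            for w in title.lower().split():
                X[i, index[w]] += 1.0
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    r = min(rank, len(s))
    return queries, U[:, :r] * s[:r]  # one latent row per query

def query_similarity(queries, Z, q1, q2):
    """Cosine similarity between two queries in the latent space."""
    a, b = Z[queries.index(q1)], Z[queries.index(q2)]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical click log: query -> titles of the results that were clicked.
log = {
    "spark svd": ["distributed svd on spark", "spark svd tutorial"],
    "matrix factorization": ["svd matrix factorization on spark"],
    "cheap flights": ["cheap flights to paris", "book cheap flights"],
}
queries, Z = lsa_query_vectors(log, rank=2)
related = query_similarity(queries, Z, "spark svd", "matrix factorization")
unrelated = query_similarity(queries, Z, "spark svd", "cheap flights")
```

With this toy log, `related` should be clearly larger than `unrelated`, because the first two queries share clicked-title vocabulary while the third shares none. A recommender could then suggest the queries with the highest similarity to the one just issued.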

Keywords/Search Tags:

SVD, LSA, Big Data, Spark, Query Recommendation

Related items

1	Research And Optimization Of Recommendation Algorithm Based On Spark Platform
2	Research And Design Of Micro-Video Recommendation System Based On Spark Big Data
3	An Ad-hoc Query Engine Based On Spark SQL
4	Design And Implementation Of Movie Recommendation System Based On Spark
5	Design And Implementation Of Personalized Tourism Recommendation System Based On Spark
6	Research Of Query Processing Technology For Geospatial Big Data Based On Spark
7	An Item-based Collaborative Filtering Recommendation Algorithm Optimization And Parallel Implementation On Spark Platform
8	Research On Product Recommendation Algorithm Based On Spark Big Data Platform
9	Research And Implementation Of Video Recommendation System Based On Spark
10	The Query Execution Optimization In Spark SQL