Research Of Matrix Factorization Parallelization Based On Graph Computing Model

Posted on: 2017-03-24  Degree: Master  Type: Thesis
Country: China  Candidate: S C Dai  Full Text: PDF
GTID: 2348330485476460  Subject: Engineering
Abstract/Summary:
With the spread of computer and information technology, both computing systems and the data they handle are growing rapidly. This is especially true in social networks and recommendation scenarios, where objects and their relationships are naturally represented as graphs. As a result, graph computing systems have become increasingly important in machine learning and data mining and are now an active research topic in both industry and academia. Matrix factorization is one of the most commonly used methods in recommender systems, where it is often used to predict user preferences. In practical applications, however, matrix factorization recommendation algorithms take a long time to run and cannot meet current business needs; the computation also requires many iterations. To address these problems, this thesis proposes a parallel implementation on a distributed graph computing platform, and the GraphX graph framework meets this requirement. The work covers the following aspects: (1) Data-parallel computing frameworks such as Spark and MapReduce face serious challenges when the data is highly interdependent: such workloads incur heavy computation and data movement, which seriously consumes cluster resources. The thesis therefore proposes using the GraphX graph computing framework to organize the data better and to exploit the graph structure fully, and optimizes the computing framework to achieve better distributed computing performance. (2) To address the problems of applying matrix factorization recommendation algorithms in practice, the thesis studies two collaborative-filtering-based matrix factorization algorithms: stochastic gradient descent (SGD) and alternating least squares (ALS). It proposes parallel implementations of both on the GraphX distributed graph computing framework and compares them. Experimental results show that, when both are implemented in parallel on GraphX, ALS is clearly superior to SGD. Overall, compared with the traditional MapReduce model, implementing matrix factorization in parallel with Spark's GraphX shows a clear advantage in execution efficiency for workloads that require many iterations.
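
To make the two algorithms named in the abstract concrete, the following is a minimal single-process sketch in plain Python. It is not the thesis's GraphX implementation and says nothing about its experimental results. Ratings are stored as edges of a bipartite user-item graph, which mirrors how a graph framework would hold them. The toy data, the function names (`sgd_epoch`, `als_half_step`, `train`), and all hyperparameters (`lr`, `reg`, `rank`) are assumptions made for illustration.

```python
import random

# Ratings as edges of a bipartite user-item graph: (user, item, rating).
# Toy data invented for illustration; 4 users, 3 items.
EDGES = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0),
         (2, 1, 2.0), (2, 2, 5.0), (3, 0, 1.0), (3, 2, 4.0)]
N_USERS, N_ITEMS = 4, 3

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rmse(edges, P, Q):
    # Root-mean-square error of predicted ratings over the observed edges.
    return (sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in edges) / len(edges)) ** 0.5

def sgd_epoch(edges, P, Q, lr=0.05, reg=0.05):
    # SGD: visit each edge and nudge both endpoint vectors along the gradient.
    for u, i, r in edges:
        err = r - dot(P[u], Q[i])
        pu = P[u][:]  # keep the old user vector for the item update
        for f in range(len(pu)):
            P[u][f] += lr * (err * Q[i][f] - reg * pu[f])
            Q[i][f] += lr * (err * pu[f] - reg * Q[i][f])

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def als_half_step(edges, fixed, target, side, reg=0.05):
    # ALS: hold one side fixed and re-solve a regularized least-squares
    # problem for every vertex on the other side, using only its neighbors.
    rank = len(fixed[0])
    neighbors = {}
    for u, i, r in edges:
        me, other = (u, i) if side == "user" else (i, u)
        neighbors.setdefault(me, []).append((other, r))
    for v, nbrs in neighbors.items():
        A = [[reg if a == b else 0.0 for b in range(rank)] for a in range(rank)]
        rhs = [0.0] * rank
        for o, r in nbrs:
            q = fixed[o]
            for a in range(rank):
                rhs[a] += r * q[a]
                for b in range(rank):
                    A[a][b] += q[a] * q[b]
        target[v] = solve(A, rhs)

def train(method, epochs=50, rank=2, seed=0):
    # Returns the training RMSE before training and after every epoch.
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(N_USERS)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(N_ITEMS)]
    history = [rmse(EDGES, P, Q)]
    for _ in range(epochs):
        if method == "sgd":
            sgd_epoch(EDGES, P, Q)
        else:
            als_half_step(EDGES, Q, P, "user")  # update users, items fixed
            als_half_step(EDGES, P, Q, "item")  # update items, users fixed
        history.append(rmse(EDGES, P, Q))
    return history
```

Calling `train("sgd")` and `train("als")` returns the per-epoch training error for each method. In this sketch each ALS vertex update depends only on that vertex's neighbors, so all users (or all items) could in principle be updated independently, which is the property a graph-parallel framework can exploit. The SGD loop as written is sequential over edges.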
Keywords/Search Tags: MapReduce, Spark, GraphX Graph Computing Model, Matrix Factorization, Recommender System