
Research and Implementation of the Parallelization of a Collaborative Filtering Algorithm Based on Matrix Factorization

Posted on: 2015-01-14  Degree: Master  Type: Thesis
Country: China  Candidate: Y Miao  Full Text: PDF
GTID: 2298330452953459  Subject: Computer Science and Technology
Abstract/Summary:
With the arrival of the big data era, network users are overwhelmed by the complex information on the Internet. A recommender system helps users discover their interests and needs and delivers the information they want, so recommender systems are valuable in both business and research. Collaborative filtering is a widely used recommendation technique, and collaborative filtering based on matrix factorization is an efficient variant proposed in recent years. The collaborative filtering algorithm based on ALS (Alternating Least Squares) uses matrix factorization to make recommendations and is widely used in application development.

The ALS algorithm has some notable characteristics. During recommendation, each prediction depends on the whole set of known ratings, and the feature matrices require a large amount of storage. Recommendation on a single node therefore runs into time and resource bottlenecks.
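As an illustration of the alternating updates that ALS performs, the following is a minimal single-node sketch in plain Python. It is not the thesis's implementation: the function names (`als`, `update_factors`, `solve`, `objective`), the plain L2 regularization weight `lam`, the rank `k`, and the toy ratings are all assumptions made for this example. Each half-step fixes one factor matrix and solves a small regularized least-squares problem per user or per item, which is why every update needs access to all known ratings for that row.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def update_factors(ratings_by_row, fixed, k, lam):
    """Recompute one side's factors while the other side (fixed) is held constant."""
    out = {}
    for row, entries in ratings_by_row.items():
        # Normal equations: (lam * I + sum v v^T) x = sum r v
        a = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
        b = [0.0] * k
        for col, r in entries:
            v = fixed[col]
            for i in range(k):
                b[i] += r * v[i]
                for j in range(k):
                    a[i][j] += v[i] * v[j]
        out[row] = solve(a, b)
    return out

def objective(ratings, users, items, lam):
    """Squared error on known ratings plus L2 penalty on all factors."""
    fit = sum((r - dot(users[u], items[i])) ** 2 for u, i, r in ratings)
    reg = sum(dot(f, f) for f in users.values()) + sum(dot(f, f) for f in items.values())
    return fit + lam * reg

def als(ratings, k=2, lam=0.1, iterations=10, seed=0):
    """ratings: list of (user, item, rating). Returns user factors, item factors, objective history."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    items = {i: [rng.random() for _ in range(k)] for i in by_item}
    users, history = {}, []
    for _ in range(iterations):
        users = update_factors(by_user, items, k, lam)  # fix items, solve for users
        items = update_factors(by_item, users, k, lam)  # fix users, solve for items
        history.append(objective(ratings, users, items, lam))
    return users, items, history

# Toy data (hypothetical) to exercise the sketch.
toy = [("u1", "a", 5.0), ("u1", "b", 3.0), ("u2", "a", 4.0),
       ("u2", "c", 1.0), ("u3", "b", 2.0), ("u3", "c", 5.0)]
users, items, history = als(toy)
predicted = dot(users["u1"], items["c"])  # estimate for an unrated pair
```

Because each half-step minimizes the regularized objective exactly for one side, the recorded objective values should not increase from one iteration to the next.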
If a recommender system cannot adapt to changes in users' interests and needs, users will not be satisfied. Research on the ALS algorithm therefore focuses on implementing it efficiently on a distributed computing platform to speed up computation.

At present, the most popular distributed computing and storage platform in application development is Hadoop. When a user submits a computing task as a series of MapReduce jobs, Hadoop executes them in a distributed manner. Through an in-depth study of the principles and characteristics of the current parallel implementation of the ALS-based collaborative filtering algorithm, we identify why traditional iterative algorithms run very inefficiently on Hadoop. Following the idea of iterative MapReduce, we propose several methods: a loop-aware scheduling algorithm, static data caching, job loop control, and fixed-point detection.

Experiments on two data sets show that iterative MapReduce improves the parallel computing efficiency of the ALS-based collaborative filtering algorithm, and that the improvement grows as the data set gets larger. The iterative MapReduce method we propose can also be applied to the parallel implementation of other iterative data mining algorithms to improve their efficiency as well.
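Job loop control, static data caching, and fixed-point detection can be pictured with a small in-process driver. The sketch below is only an analogy written in plain Python: it does not use any Hadoop API, and `run_iterative_job`, `load_static`, `step`, `distance`, `tol`, and `max_iters` are hypothetical names chosen for this example. The loop-invariant input is loaded once and reused by every iteration, and the loop stops as soon as two successive states differ by less than a tolerance.

```python
def run_iterative_job(load_static, step, initial_state, distance,
                      tol=1e-6, max_iters=50):
    """Run step repeatedly until the state stops changing or max_iters is reached.

    Returns the final state and the number of iterations executed.
    """
    static = load_static()  # static data caching: loaded once, reused every pass
    state = initial_state
    for iteration in range(1, max_iters + 1):   # job loop control
        new_state = step(static, state)
        if distance(state, new_state) < tol:    # fixed-point detection
            return new_state, iteration
        state = new_state
    return state, max_iters

# Toy usage (hypothetical): move an estimate halfway toward the data mean each pass.
load_count = 0

def load_ratings():
    global load_count
    load_count += 1          # counts how often the "expensive" load happens
    return [1.0, 3.0, 5.0]

def half_step_to_mean(data, estimate):
    mean = sum(data) / len(data)
    return estimate + 0.5 * (mean - estimate)

result, iterations = run_iterative_job(
    load_ratings, half_step_to_mean, 0.0, lambda a, b: abs(a - b))
```

In a naive chain of independent MapReduce jobs, the unchanged input would be reread on every pass and the stopping test would need a separate job; keeping both inside one loop-aware driver is the idea this sketch is meant to convey.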
Keywords/Search Tags: ALS (alternating least squares), collaborative filtering, Hadoop, iterative MapReduce