
Research And Optimization Of Recommendation Algorithm Based On Spark Platform

Posted on: 2017-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Wu
GTID: 2428330596457377
Subject: Engineering
Abstract/Summary:
A large amount of information has been produced during the rapid development of the Internet, and extracting the valuable information from this data is of great significance. Research on and development of big data arose against this background. The Hadoop MapReduce computing model has attracted extensive attention from scholars. However, Spark, with its RDD (Resilient Distributed Datasets) data model, performs better in iterative computation and has become a focus of research.

In the era of information explosion, consumers often feel confused when faced with many choices. At the same time, content producers are looking for suitable consumers. A recommendation system is the best tool for resolving this contradiction: it offers information or items to the user by analyzing useful information from massive user-behavior data. The recommendation algorithm is an important part of the recommendation system and determines its quality. Computation on a traditional single machine is time-consuming and cannot meet the real-time recommendation needs of today's enterprises; a distributed computing platform can solve this problem. Moreover, because Spark computes in memory, it is well suited to the many iterations that recommendation algorithms require.

In this paper, the recommendation algorithms in the recommendation system are studied on the Spark platform, covering the following four aspects:
(1) To address the scalability problem in recommendation systems, the Spark platform is used to parallelize the recommendation algorithm. To show that Spark offers better computing performance, the performance of the recommendation algorithm implemented on the Spark platform is compared with that of the recommendation algorithm implemented on the Hadoop MapReduce platform.
(2) To address the problem of data sparsity in recommendation systems, the ALS (Alternating Least Squares) algorithm is introduced.
(3) To solve the problem that the ALS algorithm needs too many iterations and takes too long to converge, an ALS-NCG (Alternating Least Squares-Nonlinear Conjugate Gradient) algorithm is proposed to improve the ALS recommendation algorithm.
(4) The ALS-NCG recommendation algorithm is parallelized on the Spark platform.

The experimental results show that Spark performs better than Hadoop MapReduce when parallelizing recommendation algorithms that need many iterations. The improved ALS-NCG algorithm not only outperforms the ALS algorithm but also improves the prediction accuracy of the recommendation system.
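As an illustration of the ALS idea named in the abstract, the following is a minimal single-machine sketch in plain Python. It is not the thesis's implementation, and it shows neither the Spark parallelization nor the ALS-NCG improvement. It assumes the common regularized objective (squared error over observed ratings plus lam times the squared norms of the factors). The function names, the toy RATINGS data and the default parameters are all hypothetical, chosen only for this example.

```python
import random

# Hypothetical toy data: (user, item, rating) triples for 4 users and 4 items.
RATINGS = [(0, 0, 5), (0, 1, 4), (0, 2, 1), (1, 0, 4), (1, 1, 5), (1, 3, 1),
           (2, 2, 5), (2, 3, 4), (2, 0, 1), (3, 2, 4), (3, 3, 5), (3, 1, 1)]

def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def update_factors(observed_by_row, fixed, rank, lam):
    """Hold `fixed` constant and solve one ridge regression per row."""
    updated = []
    for observed in observed_by_row:  # list of (index into fixed, rating)
        # Normal equations: (lam * I + sum f f^T) x = sum r * f
        a = [[lam if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for idx, r in observed:
            f = fixed[idx]
            for i in range(rank):
                b[i] += r * f[i]
                for j in range(rank):
                    a[i][j] += f[i] * f[j]
        updated.append(solve(a, b))
    return updated

def als(ratings, n_users, n_items, rank=2, lam=0.1, iters=20, seed=0):
    """Alternate between updating user factors and item factors."""
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(iters):
        users = update_factors(by_user, items, rank, lam)
        items = update_factors(by_item, users, rank, lam)
    return users, items

def predict(users, items, u, i):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(x * y for x, y in zip(users[u], items[i]))

def loss(ratings, users, items, lam):
    """Regularized squared error that each ALS half-step minimizes."""
    err = sum((r - predict(users, items, u, i)) ** 2 for u, i, r in ratings)
    reg = sum(x * x for row in users + items for x in row)
    return err + lam * reg
```

Calling `als(RATINGS, 4, 4)` returns the two factor lists, and `predict(users, items, 0, 3)` then estimates a rating that user 0 never gave. Each half-step solves its least-squares subproblem exactly, so the regularized loss cannot increase from one iteration to the next. This is also why every iteration is fairly expensive, which is the cost that motivates speeding up convergence.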
Keywords/Search Tags:Spark, Big Data, Collaborative Filtering Recommendation, Least squares, Recommendation System