Improvement And Implementation Of Collaborative Filtering Algorithm Based On Spark

Posted on: 2019-03-04  Degree: Master  Type: Thesis
Country: China  Candidate: R Chen  Full Text: PDF
GTID: 2428330590465722  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of technologies such as the mobile Internet, the Internet of Things, and cloud computing, global data volume has exploded and the era of big data has arrived. "Information overload" is now a major problem, encountered frequently in e-commerce, music and video services, and news. A personalized recommendation engine is a means of information filtering and has important application and research value in addressing information overload. Collaborative filtering is the most commonly used technique for personalized recommendation. However, traditional collaborative filtering algorithms have shortcomings, and this thesis proposes an improved collaborative filtering algorithm to overcome them. At the same time, processing massive data requires combining the recommendation system with a big data processing framework. Spark is a new-generation computing framework that is well suited to iterative computation and stream processing, so it is adopted as the computing framework for the recommendation system.

First, traditional collaborative filtering recommendation algorithms perform poorly on sparse data and rely on limited similarity measures. To improve recommendation accuracy, this thesis proposes a collaborative filtering recommendation algorithm based on multi-level mixed similarity. Three levels of similarity (rating similarity, interest similarity, and feature similarity) are fused to measure the similarity between users. The weights are adjusted dynamically according to the number of user comments, and an improved recommendation strategy is proposed. Experimental results show that the improved algorithm raises recommendation accuracy and effectively mitigates the impact of these problems.

Second, the system provides different recommendation services for new and old users. For old users, recommendations are generated with the improved algorithm and the ALS algorithm. For new users, different recommendation strategies are applied depending on the information the user provides, which addresses the cold start problem of the algorithm itself. To integrate better with the Spark platform, these recommendation strategies are implemented in parallel, and a Spark-based recommendation system is designed.

Finally, it is very important for a recommendation system to deliver recommendations quickly and well. To emulate the real-time recommendation process, a Kafka cluster acts as the message producer, emitting simple user information, and the Spark Streaming stream processing framework acts as the message consumer, providing users with real-time recommendation services. Simulation experiments show that the system's recommendation module achieves real-time performance.
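
The abstract does not give the formulas behind the multi-level mixed similarity, so the following is only an illustrative sketch of how three similarity levels might be fused with a weight that depends on user activity. Everything in it is an assumption rather than the thesis's method: cosine similarity stands in for rating similarity, Jaccard overlap for interest similarity, attribute matching for feature similarity, the count of rated items stands in for the number of user comments, and the names `rating_weight`, `mixed_similarity`, and the `saturation` parameter are hypothetical.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity over co-rated items (assumed stand-in for rating similarity)."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(set_a, set_b):
    """Jaccard overlap of interest tags (assumed stand-in for interest similarity)."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def feature_similarity(features_a, features_b):
    """Fraction of shared profile attributes with equal values (assumed)."""
    keys = set(features_a) & set(features_b)
    if not keys:
        return 0.0
    return sum(features_a[k] == features_b[k] for k in keys) / len(keys)


def rating_weight(n_ratings, saturation=20):
    """Hypothetical dynamic weight: grows with activity, capped at 1.0."""
    return min(n_ratings, saturation) / saturation


def mixed_similarity(user_a, user_b, saturation=20):
    """Fuse the three levels; sparse users lean on interests and features."""
    # Use the less active user's count so sparse ratings are not over-trusted.
    n = min(len(user_a["ratings"]), len(user_b["ratings"]))
    w = rating_weight(n, saturation)
    s_rating = cosine_similarity(user_a["ratings"], user_b["ratings"])
    s_interest = jaccard_similarity(user_a["interests"], user_b["interests"])
    s_feature = feature_similarity(user_a["features"], user_b["features"])
    # The remaining weight is split equally between the other two levels (assumed).
    return w * s_rating + (1 - w) * 0.5 * (s_interest + s_feature)
```

Under these assumptions, a user with no ratings gets a weight of zero on rating similarity, so the score falls back entirely on interests and profile features. This mirrors, in a simplified way, the separate treatment of new users described in the abstract.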
Keywords/Search Tags: collaborative filtering algorithm, Spark, mixed similarity, recommendation service, real-time