
Research And Application Of Recommendation Algorithm Based On Clustering

Posted on: 2019-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
GTID: 2428330566999342
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, the volume of data grows exponentially. Faced with so much data, users tend to feel overwhelmed, and locating the information they need becomes increasingly difficult and time-consuming. Recommendation systems emerged in response. They provide personalized recommendations based on each user's historical habits, preferences, and similar information. Such a service proactively recommends items of interest, which improves both the user's experience and the user's sense of belonging. The recommendation algorithm is the most important part of recommendation technology, and among the many recommendation algorithms, those based on collaborative filtering are the most widely used. Data mining is the technology for discovering latent patterns in large amounts of data, and applying data mining algorithms to recommendation systems can improve the efficiency of recommendation.

This thesis focuses on the research and application of a collaborative filtering recommendation algorithm based on clustering. Firstly, because the nearest-neighbor set in the UserCF (user-based collaborative filtering) algorithm is computed over all nodes of the global data set, a clustering algorithm is introduced to divide users into groups, so that the neighbor-set computation is confined to a single cluster. To improve clustering accuracy, the K-means algorithm is improved, yielding a K-means algorithm based on the minimum spanning tree, called MST-K. The algorithm selects the initial cluster centers using the minimum spanning tree, which avoids the adverse effect of randomly selected initial centers on the clustering results, and it measures similarity with cosine similarity, which addresses the problem of "similar differences". The time efficiency of the MST-K algorithm is further improved by parallelizing it on the Spark platform.

Secondly, user feature attributes are introduced into the rating matrix of the UserCF algorithm to reduce the sparsity of the initial rating matrix and thereby improve recommendation quality. Fusing this UserCF algorithm with MST-K yields the algorithm called M-UserCF. A parallel version of M-UserCF is designed and implemented on the Spark platform, and its performance is tested.

Finally, the M-UserCF algorithm is applied to tourist route recommendation: a "tour route" recommendation prototype system is developed, and the application results are presented. The test and application results on the Spark platform show that the parallel MST-K and parallel M-UserCF algorithms achieve good accuracy and timeliness on large datasets.
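
The idea of seeding K-means from a minimum spanning tree can be illustrated with a small sketch. The abstract does not state how MST-K derives centers from the tree, so the rule used here is an assumption made only for illustration: build the MST over users under cosine distance, drop the k-1 heaviest edges, and take the mean of each remaining component as an initial center. The function names (`cosine_distance`, `mst_edges`, `initial_centers`) and the toy rating vectors are hypothetical, and the sketch is single-machine Python rather than the Spark implementation the thesis describes.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; compares the direction of two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def mst_edges(points):
    """Prim's algorithm on the complete graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [(math.inf, -1)] * n  # cheapest known link from each node to the tree
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        i = min((c for c in range(n) if not in_tree[c]), key=lambda c: best[c][0])
        in_tree[i] = True
        if best[i][1] >= 0:
            edges.append((best[i][0], best[i][1], i))
        for j in range(n):
            if not in_tree[j]:
                d = cosine_distance(points[i], points[j])
                if d < best[j][0]:
                    best[j] = (d, i)
    return edges


def initial_centers(points, k):
    """Assumed rule: drop the k-1 heaviest MST edges, average each component."""
    assert 1 <= k <= len(points)
    kept = sorted(mst_edges(points))[: len(points) - k]
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for idx, p in enumerate(points):
        groups.setdefault(find(idx), []).append(p)
    return [[sum(col) / len(g) for col in zip(*g)] for g in groups.values()]


# Toy rating matrix (hypothetical): two users favor items 0-1, two favor items 2-3.
ratings = [[5, 4, 0, 0], [4, 5, 0, 0], [0, 0, 5, 4], [0, 0, 4, 5]]
centers = initial_centers(ratings, k=2)
```

Because the seeds come from the tree structure rather than a random draw, repeated runs on the same data start from the same centers. The resulting centers would then be passed to an ordinary K-means loop, and UserCF neighbor search would be limited to users assigned to the same cluster.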
Keywords/Search Tags: Collaborative filtering, Recommendation system, Minimum spanning tree, Clustering, Spark