Research On Hierarchical Collaborative Filtering Algorithm With Spark Platform

Posted on: 2017-05-01 | Degree: Master | Type: Thesis
Country: China | Candidate: X Y Jiang
GTID: 2428330596957386 | Subject: Engineering
Abstract/Summary:
With the rapid development of the Internet and information technology, the amount of information on the network keeps growing. Faced with such massive data, however, users find it difficult to locate the information that interests them. Recommendation systems have emerged as an effective answer to this information overload: they use a user's historical behavior and a recommendation algorithm to proactively push personalized information to that user. The recommendation algorithm is therefore critical and directly determines the performance of the recommendation system. Collaborative filtering is one of the most effective recommendation algorithms, with clear advantages in personalization, persistence, and automation. As the scale of data grows, however, collaborative filtering faces significant data sparsity and scalability problems. To improve the precision of collaborative filtering on high-dimensional sparse data, this thesis proposes a hierarchical co-clustering collaborative filtering algorithm, referred to as AHCCF. Because ratings are denser within a cluster, item similarities computed there are more reliable. AHCCF therefore uses co-clustering to partition the data set into a block matrix and, based on that partition, computes item similarity within each user cluster. It also computes a rating density matrix, derives the weight of each user cluster with the analytic hierarchy process (AHP), and finally computes the final similarity between items. This effectively alleviates the impact of data sparsity and improves recommendation quality. To improve scalability, the thesis parallelizes the AHCCF algorithm on the Spark platform, which improves both its scalability and its efficiency. Experiments on data sets of different scales show that AHCCF significantly improves recommendation accuracy and achieves better recommendation efficiency and scalability in a distributed Spark environment.
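The pipeline described in the abstract (per-cluster item similarity, rating density, AHP-derived cluster weights, final similarity) can be illustrated with a minimal single-machine sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: cosine similarity is used as the item similarity, the user clusters are passed in ready-made instead of being produced by co-clustering, density is measured per user cluster over all items, the AHP pairwise comparison matrix is built from ratios of cluster densities, and the final similarity is taken to be the weighted sum of per-cluster similarities. All function names are hypothetical, and the Spark parallelization is not shown.

```python
from math import prod, sqrt


def item_cosine(ratings, users, i, j):
    """Cosine similarity of item columns i and j over the given user rows (0 = unrated)."""
    dot = sum(ratings[u][i] * ratings[u][j] for u in users)
    norm_i = sqrt(sum(ratings[u][i] ** 2 for u in users))
    norm_j = sqrt(sum(ratings[u][j] ** 2 for u in users))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0


def density(ratings, users):
    """Fraction of rated (non-zero) cells in the rows belonging to one user cluster."""
    cells = [r for u in users for r in ratings[u]]
    return sum(1 for r in cells if r != 0) / len(cells)


def ahp_weights(pairwise):
    """AHP priority vector using the row geometric-mean approximation."""
    n = len(pairwise)
    geo_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def combined_similarity(ratings, clusters, i, j):
    """Weighted sum of per-cluster item similarities (assumed combination rule)."""
    # Keep only clusters that contain at least one rating, so density ratios are defined.
    scored = [(users, density(ratings, users)) for users in clusters]
    scored = [(users, d) for users, d in scored if d > 0]
    if not scored:
        return 0.0
    densities = [d for _, d in scored]
    # Assumption: the pairwise judgement "cluster a vs. cluster b" is the density ratio.
    pairwise = [[a / b for b in densities] for a in densities]
    weights = ahp_weights(pairwise)
    sims = [item_cosine(ratings, users, i, j) for users, _ in scored]
    return sum(w * s for w, s in zip(weights, sims))


# Toy data: 4 users x 3 items, with two assumed user clusters.
ratings = [
    [5, 4, 0],
    [4, 5, 0],
    [0, 1, 5],
    [1, 0, 4],
]
clusters = [[0, 1], [2, 3]]
print(combined_similarity(ratings, clusters, 0, 1))
```

With a pairwise matrix built purely from density ratios, the AHP weights reduce to the normalized densities; a real AHP step would typically combine several criteria, which this sketch does not attempt.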
Keywords/Search Tags: recommendation, collaborative filtering, Spark, co-clustering, AHP