Design, Parallel Implementation, and Application of a Collaborative Filtering Algorithm

Posted on: 2020-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Lu
GTID: 2428330590496007
Subject: Computer technology
Abstract/Summary:
Collaborative filtering is widely used in recommendation systems. In some practical scenarios, however, the traditional algorithm ignores the common preferences implied in users' ratings of items, as well as the influence that differences in mean score between items have on the final item-to-item similarity. To address these problems, this thesis first improves the traditional similarity formula. Then, to improve the efficiency of item-based collaborative filtering and to shorten the long neighbor search time of the traditional algorithm, it designs a collaborative filtering recommendation algorithm based on clustering and similarity, named CS-CF, which introduces a clustering algorithm to narrow the range of the nearest-neighbor set. To further improve the real-time performance and scalability of the recommendation system, a parallelization scheme for CS-CF is designed on Spark, a mainstream big data platform, taking advantage of its strengths in iterative and in-memory computing. The scheme parallelizes both the item similarity computation and the score computation by making use of RDD parallel computing, the RDD caching mechanism, and Spark broadcast variables. Finally, the performance of the parallel CS-CF algorithm is tested on the public MovieLens dataset, and a prototype movie recommendation system is developed in which CS-CF is applied to verify the usability of the research results. The experimental and application results show that CS-CF and its parallelization scheme on Spark perform well in accuracy, usability, and timeliness.
Keywords/Search Tags:Recommendation system, Item-based collaborative filtering, Spark platform, Parallelization
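
The idea of narrowing the neighbor search with clusters can be illustrated with a minimal single-machine sketch. It is not the thesis's CS-CF implementation: the abstract does not give the improved similarity formula or name the clustering algorithm, so the sketch assumes the standard adjusted cosine similarity and a precomputed item-to-cluster mapping. The names `ratings`, `clusters`, `adjusted_cosine`, and `predict` are hypothetical.

```python
from math import sqrt

def user_means(ratings):
    """Mean rating per user; ratings maps user -> {item: rating}."""
    return {u: sum(rs.values()) / len(rs) for u, rs in ratings.items() if rs}

def adjusted_cosine(ratings, means, a, b):
    """Standard adjusted cosine similarity between items a and b
    (assumed here; NOT the improved formula from the thesis)."""
    num = da = db = 0.0
    for u, rs in ratings.items():
        if a in rs and b in rs:          # only users who rated both items
            x = rs[a] - means[u]         # center on the user's mean rating
            y = rs[b] - means[u]
            num += x * y
            da += x * x
            db += y * y
    if da == 0.0 or db == 0.0:
        return 0.0
    return num / sqrt(da * db)

def predict(ratings, clusters, user, target, k=2):
    """Predict user's rating of target from the k most similar items,
    searching only items in the same cluster as target."""
    means = user_means(ratings)
    cid = clusters[target]
    # The cluster restricts the candidate neighbor set.
    candidates = [i for i in ratings[user]
                  if i != target and clusters.get(i) == cid]
    sims = [(adjusted_cosine(ratings, means, target, i), i) for i in candidates]
    top = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    if not top:
        return None                      # no usable neighbor in this cluster
    return sum(s * ratings[user][i] for s, i in top) / sum(s for s, _ in top)

# Toy data (made up for illustration; not from MovieLens).
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
    "u4": {"B": 4, "C": 1},
}
clusters = {"A": 0, "B": 0, "C": 1}     # assumed to come from a prior clustering step

print(predict(ratings, clusters, "u4", "A"))
```

Because only items in the target's cluster are compared, the number of similarity computations per prediction shrinks with the cluster size. In a Spark version along the lines the abstract describes, the cluster mapping would be a natural candidate for a broadcast variable, though the abstract does not say which data the thesis actually broadcasts.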