
Design And Implementation Of A Hybrid Features-based Collaborative Filtering Recommender System Under Spark

Posted on: 2019-01-13  Degree: Master  Type: Thesis
Country: China  Candidate: C Pan  Full Text: PDF
GTID: 2428330590975362  Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet and big data, recommender systems have become an important information filtering mechanism. Collaborative filtering (CF) is one of the most important and popular techniques in this area. However, traditional CF still suffers from problems such as data sparsity and poor scalability, which result in inaccurate recommendations and poor real-time performance. To alleviate these problems, this thesis explores various features of user behavior and of items to propose two recommendation methods based on collaborative filtering ideas, then designs and implements a recommender system on the Spark platform. The main research work of this thesis is summarized as follows:

(1) A CF algorithm combined with user trust is proposed. Drawing on the theory of trust models in social networks, this approach introduces the concepts of user activity and reliability to build a user trust relationship model. A dynamic adaptive weight integrates the trust relationship and the similarity relationship among users into a recommendation weight that is used to select the nearest neighbors, which alleviates data sparsity. A tree-based clustering scheme is then designed to reduce the scale of online operation data and improve computing efficiency.

(2) A CF algorithm based on interest points is proposed. This approach divides items into different interests and explores the relationships among users, interests and items. Its distinctive feature is that it finds nearest neighbors based on users' preferences for interest points rather than on co-rated items, which alleviates the problem of data sparsity. The result is then combined with the prediction of a Latent Factor Model, so that both the general factor and the individual factor are considered when producing the final recommendations, improving recommendation accuracy.

(3) Based on the above ideas, this thesis designs and implements a Spark-based recommender system that is complete, flexible, configurable and suitable for big data environments. The system consists of five modules: the data warehouse module, the off-line computing module, the recommendation engine module, the configuration parsing module and the configuration module. It emphasizes scalability and maintainability and achieves high cohesion and low coupling, which makes further development convenient.

Finally, the experimental results show that this recommender system performs well in both scalability and accuracy. Compared with traditional CF, the precision of (1) and (2) is 18.7% and 9.5% higher respectively, and the recall is 11.3% and 6.2% higher respectively. In terms of computing performance, the average running time of (1) and (2) is 1.78 seconds and 0.828 seconds respectively, much lower than the 9.52 seconds of traditional CF. In addition, increasing the number of computing nodes yields a higher speedup and further improves computing efficiency, thus effectively alleviating the problem of poor scalability.
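The neighbor selection in (1) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the linear blend in `recommend_weight`, the `adaptive_alpha` rule (lean on similarity only as the number of co-rated items grows), the `threshold` parameter, and the precomputed `trust` dictionary are all hypothetical stand-ins, not the author's method.

```python
from math import sqrt


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity over co-rated items; 0.0 when there is no overlap."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adaptive_alpha(n_common, threshold=5):
    """Hypothetical adaptive rule: trust similarity more as co-rated items grow."""
    return min(n_common / threshold, 1.0)


def recommend_weight(similarity, trust, alpha):
    """Assumed linear blend of similarity and trust (not the thesis's formula)."""
    return alpha * similarity + (1 - alpha) * trust


def nearest_neighbors(target, ratings, trust, k, threshold=5):
    """Rank other users by blended weight and return the top k as (user, weight)."""
    scored = []
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        n_common = len(set(ratings[target]) & set(other_ratings))
        sim = cosine_similarity(ratings[target], other_ratings)
        # Trust values are assumed to be precomputed; missing pairs count as 0.
        t = trust.get((target, other), 0.0)
        alpha = adaptive_alpha(n_common, threshold)
        scored.append((other, recommend_weight(sim, t, alpha)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: "w" shares no rated items with "u" but is highly trusted.
ratings = {
    "u": {"a": 5, "b": 3},
    "v": {"a": 5, "b": 3},
    "w": {"c": 4},
}
trust = {("u", "v"): 0.2, ("u", "w"): 0.9}
neighbors = nearest_neighbors("u", ratings, trust, k=2)
```

In this toy data, user `w` has no co-rated items with `u`, so a purely similarity-based CF would never select it; under the assumed blend, its high trust value still makes it a candidate neighbor, which is the sparsity-alleviating effect the abstract describes.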
Keywords/Search Tags:Recommender System, Collaborative Filtering, Trust Model, Interests Model, Spark