
Study Of Collaborative Filtering Recommendation Algorithm On Hadoop

Posted on: 2018-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: H F Tang
GTID: 2348330518456581
Subject: Software engineering
Abstract/Summary:
With the advent of the Big Data era, people enjoy the benefits of Big Data but also suffer the troubles it brings. A search for one piece of information can return an overwhelming amount of related material. Faced with such massive data, people cannot quickly locate what they want, and must spend considerable time and effort judging the validity and usability of what they find. The efficiency with which information is used declines rapidly as the data grows; this is the well-known problem of information overload. Although Google, Baidu and other search engines offer some help, they still fail to meet people's demand for personalized information. With the emergence of e-commerce (Amazon, Taobao, Jingdong, etc.) and social networks (Twitter, Weibo, etc.), this demand has become more intense. How to help people quickly find information that interests and satisfies them in the context of Big Data has therefore become a hot topic in both academia and industry.

To meet this demand, researchers proposed personalized recommendation systems: intelligent systems that mine a user's historical data to provide information the user may be interested in. Whether such a system can serve users satisfactorily depends on its recommendation algorithm. Among the many personalized recommendation algorithms, collaborative filtering is one of the most successful. Although it has achieved good results, it still has notable drawbacks, such as data sparsity, limited scalability and the cold start problem. To further improve the efficiency of personalized recommendation, and on the basis of extensive reading of the relevant literature and in-depth study of collaborative filtering technology, this thesis improves the existing collaborative filtering recommendation algorithms. It proposes a collaborative filtering algorithm that computes user similarity from user rating differences and predicts scores through item clustering, and implements the proposed algorithm on the Hadoop platform. The specific research contents are as follows:

(1) The thesis proposes a user similarity method based on user rating differences. The method takes into account the score differences between users, their scoring preferences and their commonly rated items. It mines and applies the information contained in user ratings, in particular ratings that fall below the mean. This effectively improves the accuracy of user similarity and alleviates the decline in recommendation quality caused by data sparsity.

(2) To improve the traditional nearest-neighbor score prediction method, the thesis proposes a score prediction method based on item clustering, which predicts scores for unrated items. The method has two core ideas. First, since neighbors may yield several score values for an unrated item, it selects the maximum among them as the final score. Second, it uses the item weighting factor and user similarity as weights, adjusting the weight that each predicted item should carry for a particular user. This effectively improves the accuracy of score prediction and the quality of recommendation.

(3) The thesis implements the resulting collaborative filtering recommendation algorithm, which combines user similarity based on user rating differences with score prediction based on item clustering, on the Hadoop platform. The distributed computation is implemented with the MapReduce computational model: the time-consuming parts of the algorithm run off-line, and the parts that are not time-consuming run on-line. This not only addresses the scalability of the algorithm but also, to a certain extent, addresses the real-time problem of recommendation over massive data.

(4) The proposed personalized collaborative filtering algorithm is tested on the movie data set provided by MovieLens. The results show that the proposed method is superior to several existing methods in recommendation quality.
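
As a rough illustration of items (1) and (2), the sketch below shows a generic user-based collaborative filter in which similarity is derived from rating differences on commonly rated items. The abstract does not give the thesis's formulas, so everything here is an assumption made for illustration: the similarity (one minus the normalized mean absolute rating difference, scaled by the share of commonly rated items), the mean-centered neighbor prediction, the function names, and the toy data. It does not include the item clustering step or the maximum-score selection described in (2).

```python
# Illustrative sketch only: NOT the thesis's formulas, which the abstract does not state.
# ratings maps user -> {item: score}; scores are assumed to lie on a 1-5 scale.

def user_mean(ratings, user):
    """Average score given by one user."""
    scores = ratings[user].values()
    return sum(scores) / len(scores)

def similarity(ratings, u, v, max_diff=4.0):
    """Stand-in similarity built from rating differences on commonly rated items.

    (1 - normalized mean absolute difference) rewards agreeing scores;
    the overlap factor (common items / all items rated by either user)
    discounts pairs that share few rated items.
    """
    common = ratings[u].keys() & ratings[v].keys()
    if not common:
        return 0.0
    mean_diff = sum(abs(ratings[u][i] - ratings[v][i]) for i in common) / len(common)
    overlap = len(common) / len(ratings[u].keys() | ratings[v].keys())
    return (1.0 - mean_diff / max_diff) * overlap

def predict(ratings, user, item, k=2):
    """Mean-centered weighted prediction from the k most similar raters of `item`."""
    candidates = [
        (similarity(ratings, user, v), v)
        for v in ratings
        if v != user and item in ratings[v]
    ]
    neighbors = sorted((s, v) for s, v in candidates if s > 0)[-k:]
    if not neighbors:
        return user_mean(ratings, user)  # fall back to the user's own average
    weighted = sum(s * (ratings[v][item] - user_mean(ratings, v)) for s, v in neighbors)
    total = sum(s for s, _ in neighbors)
    return user_mean(ratings, user) + weighted / total

# Hypothetical toy data, not taken from MovieLens.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 5, "m2": 3, "m3": 4, "m4": 5},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
}

if __name__ == "__main__":
    print(similarity(ratings, "alice", "bob"))
    print(predict(ratings, "alice", "m4"))
```

In a sketch like this, the pairwise similarity computation is the expensive part, since it grows with the number of user pairs. That is the kind of step the abstract assigns to off-line MapReduce jobs, leaving only the cheap final prediction for the on-line path.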
Keywords/Search Tags: collaborative filtering recommendation algorithm, Hadoop, user rating differences, user similarity, item grading prediction