
Research Of Parallel Collaborative Filtering Recommendation Algorithm Based On Hadoop

Posted on: 2017-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2308330485980613
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, information overload has become increasingly serious, and recommendation systems have emerged as an important information-filtering tool. Collaborative filtering is the most widely used recommendation algorithm in practical systems. However, as the numbers of users and items grow substantially, the proportion of rated items shrinks, which makes the user-item rating matrix sparse and reduces the recommendation accuracy of traditional collaborative filtering. At the same time, most current research on collaborative filtering focuses on designing and optimizing single-machine algorithms; as recommendation systems continue to scale up, most traditional algorithms run into serious computational bottlenecks. It is therefore necessary to parallelize collaborative filtering so that it can handle large-scale data.

To address the data-sparseness and scalability problems of collaborative filtering, this thesis first presents a collaborative filtering algorithm based on IALM (the Inexact Augmented Lagrange Multiplier method) and filling credibility. Building on this, and on a study of the Hadoop HDFS distributed file system and the MapReduce programming model, a MapReduce parallel version of the algorithm is designed and implemented. Finally, a prototype movie recommendation system based on Hadoop is built. The main research work and achievements are as follows:

(1) An improved collaborative filtering algorithm based on IALM and filling credibility is proposed. To address data sparseness, the IALM algorithm is used to fill the sparse user-item rating matrix, and on this basis the concept of filling credibility is introduced.
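As an illustration of the matrix-filling step, the following is a minimal sketch of low-rank matrix completion by singular value thresholding, a simpler member of the same family of methods as IALM. The toy rating matrix, threshold `tau`, step size, and iteration count are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

def svt_complete(R, mask, tau=2.0, step=1.2, iters=100):
    """Fill missing entries of a rating matrix by singular value
    thresholding: alternately shrink singular values (to keep the
    estimate low-rank) and correct the observed entries."""
    Y = np.zeros_like(R)
    X = np.zeros_like(R)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        s = np.maximum(s - tau, 0.0)      # shrink singular values
        X = (U * s) @ Vt                  # current low-rank estimate
        Y += step * mask * (R - X)        # correct on observed cells only
    return X

# toy 4x4 user-item rating matrix; 0 marks an unobserved rating
R = np.array([[5, 4, 0, 1],
              [4, 0, 3, 1],
              [1, 1, 0, 5],
              [0, 1, 4, 4]], dtype=float)
mask = (R > 0).astype(float)
filled = svt_complete(R, mask)
```

The filled matrix agrees with the observed ratings and supplies estimates for the missing cells, which the credibility-weighted neighborhood computation can then consume.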
Considering that users' interests change over time, filling credibility and an exponential forgetting function are combined to correct the filled matrix. On this basis, a collaborative filtering algorithm based on IALM and filling credibility is presented to alleviate the low recommendation accuracy of traditional collaborative filtering on sparse data. Experimental results show that, compared with the traditional algorithm, the proposed algorithm reduces the mean absolute error by 10.98% when the number of neighbors is ten, so it significantly improves recommendation quality under sparse data.

(2) A MapReduce parallelization of the collaborative filtering algorithm based on IALM and filling credibility is designed. To address the scalability problem of collaborative filtering, the algorithm is decomposed, following the MapReduce programming model, into a flow of seven MapReduce jobs so that the computation can run distributed on the Hadoop platform. Experimental results show that, compared with a single node, a three-node Hadoop cluster reduces computing time by about 66.14% on the MovieLens-10M dataset, demonstrating that implementing the algorithm on Hadoop effectively improves the scalability of the recommendation system.

(3) A movie recommendation prototype system based on Hadoop is built. After requirements analysis, system design, and implementation, the MapReduce parallel collaborative filtering algorithm based on IALM and filling credibility is combined with Hadoop, MATLAB, etc., to design and implement a movie recommendation system on a Hadoop cluster consisting of multiple computers.
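The seven-job pipeline itself is not reproduced in the abstract. As a hypothetical illustration of the MapReduce idea behind it, the sketch below simulates one map/shuffle/reduce pass that computes item-item cosine similarities from (user, item, rating) triples, a step that commonly appears in parallel collaborative filtering pipelines. All function names and the toy data are assumptions for illustration, not the thesis's actual job definitions.

```python
from collections import defaultdict
from itertools import combinations
import math

# (user, item, rating) triples standing in for one HDFS input split
ratings = [
    ("u1", "A", 5.0), ("u1", "B", 3.0),
    ("u2", "A", 4.0), ("u2", "B", 2.0), ("u2", "C", 1.0),
    ("u3", "B", 4.0), ("u3", "C", 5.0),
]

def map_user_vectors(triples):
    """Map phase: emit each rating keyed by user."""
    for user, item, r in triples:
        yield user, (item, r)

def reduce_item_pairs(user_vectors):
    """Second-stage map over grouped users: emit co-rated item
    pairs with rating products (the numerator terms of cosine)."""
    for user, pairs in user_vectors.items():
        for (i, ri), (j, rj) in combinations(sorted(pairs), 2):
            yield (i, j), ri * rj

# "shuffle": group mapper output by key, as Hadoop does between phases
by_user = defaultdict(list)
for user, pair in map_user_vectors(ratings):
    by_user[user].append(pair)

# "reduce": sum the per-pair products across all users
pair_sums = defaultdict(float)
for key, val in reduce_item_pairs(by_user):
    pair_sums[key] += val

# per-item squared norms (in a real pipeline, a separate small job)
norms = defaultdict(float)
for _, item, r in ratings:
    norms[item] += r * r

def cosine(i, j):
    return pair_sums[(i, j)] / math.sqrt(norms[i] * norms[j])
```

In an actual Hadoop deployment, each `yield` would become a key-value emission from a Mapper or Reducer, and the grouping loops would be handled by the framework's shuffle phase; chaining several such jobs gives a flow like the seven-job design described above.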
Keywords: recommendation system, collaborative filtering, filling credibility, parallelization