
The Improvement And Implementation Of Collaborative Filtering Recommendation Algorithm Based On Hadoop

Posted on: 2016-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z B Wu
GTID: 2308330503978052
Subject: Computer technology
Abstract/Summary:
Collaborative filtering (CF) is the most widely and successfully used personalized recommendation technology. However, it still faces several problems, such as data sparsity, poor scalability, and an inability to quickly detect sudden changes in users' interests. To address these problems, this thesis presents a hybrid recommendation algorithm based on collaborative filtering. The main work of this thesis is as follows:
1) Improving the CF algorithm through item clustering. The data underlying a traditional collaborative filtering algorithm can be described as a user-item rating matrix. Through item clustering, this matrix can be converted into a user-class rating matrix. The conversion can reduce the sparsity of the matrix, which improves calculation accuracy, and reduce its size, which improves the scalability of the algorithm.
2) Introducing the concept of an item-class matrix. This matrix indicates how representative each item is of its class. Incorporating item properties and popularity when computing the matrix captures the differences between items more accurately, so the effects of those differences can be eliminated and recommendation accuracy improved.
3) Designing a user activity factor. To reduce the impact of user activity on the recommendation results, users' actions are weighted according to their activity factor.
4) Introducing an online recommendation module based on a user interest model. This module weights users' historical behaviors according to a time factor, so the algorithm can quickly detect sudden changes in users' interests, which is precisely a weakness of collaborative filtering. In addition, the collaborative filtering algorithm based on the user-class matrix can only recommend items in classes the user does not yet know; it cannot recommend items in classes the user already knows, so it loses some ability to explore more items. The online recommendation module is also intended to alleviate this problem.
5) Dividing data processing into offline and online stages, which reduces the amount of computation. The algorithm is also run on the Hadoop distributed platform so that more resources can be allocated to it, further improving its scalability.
Finally, theoretical analysis and experiments show that the hybrid recommendation algorithm designed in this thesis achieves good results in both scalability and accuracy.
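Two of the ideas above, collapsing the user-item rating matrix into a user-class rating matrix and weighting historical behavior by a time factor, can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the function names `user_class_matrix` and `time_decay_weight`, the use of a plain average to aggregate ratings within a class, the exponential half-life decay, and the toy data. It is not the author's method.

```python
from collections import defaultdict


def user_class_matrix(ratings, item_class):
    """Collapse user-item ratings into user-class ratings.

    ratings:    {user: {item: rating}}
    item_class: {item: class label}, e.g. produced by item clustering
    Assumption: a user's rating for a class is the plain average of that
    user's ratings for the items in the class.
    """
    result = {}
    for user, items in ratings.items():
        sums = defaultdict(float)
        counts = defaultdict(int)
        for item, rating in items.items():
            cls = item_class[item]
            sums[cls] += rating
            counts[cls] += 1
        result[user] = {cls: sums[cls] / counts[cls] for cls in sums}
    return result


def time_decay_weight(age_days, half_life_days=30.0):
    """Assumed time factor: an action loses half its weight
    every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


# Toy data (hypothetical): three items grouped into two classes.
ratings = {
    "u1": {"i1": 4.0, "i2": 2.0, "i3": 5.0},
    "u2": {"i2": 3.0},
}
item_class = {"i1": "c1", "i2": "c1", "i3": "c2"}

uc = user_class_matrix(ratings, item_class)
print(uc)                       # user-class ratings
print(time_decay_weight(30.0))  # weight of a 30-day-old action
```

In this toy case the matrix shrinks from three item columns to two class columns, and user u2, who rated one item out of three, now covers one class out of two, which is the sparsity and size reduction that item 1 describes.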
Keywords/Search Tags: personalized recommendation, collaborative filtering, user interest model, Hadoop, HBase