Research On Improved Distributed Collaborative Filtering Recommendation Algorithm

Posted on: 2016-02-03  Degree: Master  Type: Thesis
Country: China  Candidate: B W Wu
GTID: 2348330488982008  Subject: Communication and Information System
Abstract/Summary:
The Internet is now fully integrated into people's daily lives. People's diverse online activity produces a large amount of data, and with it the problem of information overload. Recommender systems arose in response: they apply rules and algorithms to generate personalized recommendations for users. Collaborative filtering is one of the most successful and widely used recommendation algorithms and has been studied extensively. However, as resources and user data grow rapidly and data sparsity becomes severe, traditional collaborative filtering faces many problems on large data sets, such as low recommendation accuracy, high computational complexity, and poor scalability.

This thesis focuses on these problems, which restrict the development of recommender systems. The main work is as follows:

1) Building on user-based collaborative filtering, we propose a novel distributed recommendation algorithm based on an improved Minhash algorithm. Minhash is efficient and well suited to parallel computation. We take the differences between user ratings into account, extract multi-valued information about users' interest preferences, and propose an improved multi-dimension Minhash algorithm to measure similarity between users.

2) Because of data sparsity, items often share very few common user ratings, which makes the similarity between items inaccurate. To address this, we first build an item co-occurrence matrix and apply a preset threshold to find the item pairs whose similarity cannot be calculated accurately by traditional methods. For those pairs, we then propose a new and effective replacement for the traditional methods, drawing on the idea of relationship delivery in social networks, to improve the quality of item similarity calculation.

3) We run both improved collaborative filtering algorithms on the MapReduce framework to improve their accuracy and scalability, and ultimately to alleviate the performance bottleneck that large data volumes and high computational complexity impose on recommendation algorithms. Compared with traditional collaborative filtering on a single node, both proposed algorithms achieve a better balance between recommendation accuracy and scalability.
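
To make the role of Minhash in item 1) concrete, the following is a minimal sketch of standard Minhash used to estimate the Jaccard similarity between the item sets of two users. It is not the improved multi-dimension Minhash proposed in the thesis, which also uses rating values and is not specified in this abstract. The function names, the toy user data, the linear hash family, and the choice of 128 hash functions are all assumptions made for illustration.

```python
import random

# Assumption: a large prime used as the modulus of the hash family.
PRIME = 2_147_483_647


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(items, hash_params):
    """Return the Minhash signature of a non-empty set of integer item IDs.

    Each signature entry is the minimum hash value over the set
    for one hash function.
    """
    return [min((a * x + b) % PRIME for x in items) for a, b in hash_params]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching entries."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Hypothetical toy data: user ID -> set of rated item IDs.
user_items = {
    "u1": {1, 2, 3, 4},
    "u2": {1, 2, 3, 4},
    "u3": {100, 200, 300},
}

params = make_hash_params(128)
sigs = {user: minhash_signature(items, params)
        for user, items in user_items.items()}

print(estimate_jaccard(sigs["u1"], sigs["u2"]))
print(estimate_jaccard(sigs["u1"], sigs["u3"]))
```

Each signature depends only on one user's own item set, so signatures can be computed independently per user. That independence is what makes the approach a natural fit for a map step in a framework such as MapReduce, although the actual job design used in the thesis is not described here.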
Keywords/Search Tags: Collaborative filtering, minhash, data sparsity, co-occurrence matrix, relations delivery, distributed computing