Recommendation Algorithms In The Big Data Era

Posted on: 2015-01-25; Degree: Master; Type: Thesis
Country: China; Candidate: Y S Sun
GTID: 2268330428961565; Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet and information technology, people face a volume of information that is beyond the capacity of any individual to process, a situation known as the information overload problem. A recommender system is an effective tool for addressing information overload: it makes personalized recommendations based on historical user behavior. Collaborative filtering is the most widely used and most successful recommendation technique, but its performance relies on an accurate measure of similarity, and it does not scale well to large datasets. To address these problems, this work proposes a recommendation algorithm based on an item hierarchy and a parallelized matrix factorization on Hadoop.

Major contributions of this thesis include:

1) Modify the cosine and Pearson similarity measures to avoid unnecessary similarity computation. An inverted index data structure is introduced to reduce the computational complexity of finding the K nearest neighbors. Results show that the modified formulas significantly reduce the time needed to find the nearest neighbors, improving the ability to handle big data.

2) Propose a collaborative filtering algorithm based on an item hierarchy. First, the item hierarchy is extended automatically, using the partial labels and categories of items to build a full hierarchical structure covering all items; the similarity between items is then computed from the hierarchy. Results show that, compared with traditional collaborative filtering, the algorithm markedly improves the ability to handle big data and achieves a better RMSE.

3) Implement distributed matrix factorization with the MapReduce distributed computing framework. The main module of matrix factorization is matrix multiplication, so the thesis studies the distributed implementation of matrix multiplication, analyzing the inner product, outer product, and block methods. Experiments indicate that the efficiency of matrix multiplication is improved.
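
The inverted-index idea in the first contribution can be illustrated with a minimal sketch. It is not the thesis's modified formula, which the abstract does not give; it only shows one common way an inverted index avoids unnecessary similarity computation. The sketch assumes item-based cosine similarity, ratings stored as a nested dictionary, and string item identifiers; the function name `top_k_neighbors` and the data layout are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def top_k_neighbors(ratings, k):
    """Return the k most cosine-similar items for every item.

    ratings: {item_id: {user_id: rating}} (hypothetical layout).
    Only item pairs that share at least one user are ever compared.
    """
    # Inverted index: user -> list of (item, rating) pairs for that user.
    index = defaultdict(list)
    for item, by_user in ratings.items():
        for user, r in by_user.items():
            index[user].append((item, r))

    # Vector norm of each item, computed once.
    norms = {
        item: math.sqrt(sum(r * r for r in by_user.values()))
        for item, by_user in ratings.items()
    }

    # Accumulate dot products by walking each user's posting list.
    # Pairs with no common user never appear, so no work is spent on them.
    dots = defaultdict(float)
    for postings in index.values():
        for a, ra in postings:
            for b, rb in postings:
                if a < b:  # count each unordered pair once
                    dots[(a, b)] += ra * rb

    sims = defaultdict(list)
    for (a, b), dot in dots.items():
        denom = norms[a] * norms[b]
        if denom == 0.0:  # skip all-zero rating vectors
            continue
        s = dot / denom
        sims[a].append((b, s))
        sims[b].append((a, s))

    return {
        item: heapq.nlargest(k, sims[item], key=lambda pair: pair[1])
        for item in ratings
    }


ratings = {
    "A": {"u1": 5, "u2": 3},
    "B": {"u1": 4, "u2": 3},
    "C": {"u3": 2},
}
neighbors = top_k_neighbors(ratings, k=1)
```

In this toy data, items A and B share two users and are compared, while item C shares no user with any other item and is never paired with anything. On sparse rating data most item pairs are of this second kind, which is where the savings over an all-pairs comparison would come from.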
Keywords/Search Tags: Recommendation System, Collaborative Filtering, Matrix Factorization