A Hybrid Recommendation Algorithm Based On MapReduce

Posted on: 2018-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Jiang
GTID: 2348330536959567
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth in the volume and complexity of data, information overload has become a serious problem in the web information environment, especially in e-commerce services. Recommender systems offer many effective approaches to this problem, among which collaborative filtering is the most widely used in practice. However, these algorithms hit obvious bottlenecks as the scale of the recommender system grows. Moreover, computing over very large data sets may reduce the accuracy of the recommendation results and may even cause the recommendation computation to fail. Hence, traditional collaborative filtering algorithms need to be reworked. This thesis presents and implements an improved collaborative filtering recommendation algorithm based on the MapReduce parallel computing framework. First, the algorithm is parallelized with the MapReduce framework. For the item-based collaborative filtering algorithm, a covariance matrix replaces the similarity matrix to reduce the time spent computing similarities. When computing the recommendation results, a Top-N method selects the nearest neighbors to reduce the amount of computation. For the user-based collaborative filtering algorithm, the experimental data are grouped by clustering. For each group, the algorithm takes the users of the same group as neighbors to compute the second recommendation result, while the cluster center of each group is used as a neighbor to compute the third recommendation result. The algorithm then models the data with linear regression, taking the three recommendation results as training inputs and the actual ratings as the output. After the loss function of the model is defined, the optimal mixing ratio is found by gradient descent. The experimental results show that, compared with the traditional approach, the new collaborative filtering recommendation algorithm based on the MapReduce parallel computing framework substantially improves the accuracy of the recommendation results while also improving the scalability of the parallelized algorithm. The improved algorithm is therefore better able to cope with data of unprecedented scale.
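
The blending step described in the abstract (three recommendation results combined by linear regression, with the mixing ratio found by gradient descent) can be illustrated with a minimal single-machine sketch. This is not the thesis implementation: the abstract does not state the loss function, so mean squared error is assumed here, the intercept term is an added assumption, the function names (predict, mse, fit_blend) are hypothetical, and the ratings are made-up illustration data. No MapReduce parallelization is shown.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Blend the component predictions into one rating (weighted sum plus intercept)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def mse(weights: Sequence[float], bias: float,
        rows: Sequence[Sequence[float]], targets: Sequence[float]) -> float:
    """Mean squared error; assumed loss, the abstract does not name one."""
    errors = [predict(weights, bias, r) - t for r, t in zip(rows, targets)]
    return sum(e * e for e in errors) / len(errors)


def fit_blend(rows: Sequence[Sequence[float]], targets: Sequence[float],
              learning_rate: float = 0.005, epochs: int = 5000) -> Tuple[List[float], float]:
    """Find mixing weights by batch gradient descent on the MSE loss."""
    n = len(rows)
    k = len(rows[0])
    weights = [1.0 / k] * k  # start from an equal mix of the components
    bias = 0.0
    for _ in range(epochs):
        errors = [predict(weights, bias, r) - t for r, t in zip(rows, targets)]
        # Gradient of the MSE with respect to each weight and the intercept.
        grad_w = [2.0 / n * sum(e * r[j] for e, r in zip(errors, rows)) for j in range(k)]
        grad_b = 2.0 / n * sum(errors)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Each row holds three hypothetical component predictions for one (user, item) pair:
# (item-based result, same-cluster-neighbor result, cluster-center result).
rows = [
    (4.2, 3.8, 4.0),
    (2.1, 2.9, 2.5),
    (3.5, 3.0, 3.6),
    (4.8, 4.1, 4.5),
    (1.5, 2.2, 2.0),
    (3.0, 3.4, 2.8),
]
targets = [4.0, 2.0, 4.0, 5.0, 1.0, 3.0]  # made-up actual ratings

weights, bias = fit_blend(rows, targets)
print("mixing weights:", weights, "intercept:", bias)
print("training MSE:", mse(weights, bias, rows, targets))
```

The learning rate is kept small because ratings on a 1 to 5 scale produce large gradients; with unscaled features, a larger step can make batch gradient descent diverge.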
Keywords/Search Tags: Collaborative filtering, Hybrid recommendation algorithm, Clustering, Linear regression, Hadoop, MapReduce