Research On Clustering Collaborative Filtering Recommendation Algorithm Based On MapReduce

Posted on: 2022-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Hu
GTID: 2518306521981959
Subject: Applied Statistics
Abstract/Summary:
The growing popularity of the Internet and continuous improvements in data storage technology strongly support the acquisition and storage of data, but the explosive growth of data has brought an increasingly serious problem of information overload. An efficient and accurate recommendation system helps consumers overcome the obstacles of information retrieval when facing massive data, and it can create commercial value for merchants. A good recommendation system depends on a reliable recommendation algorithm. Collaborative filtering, one of the most important algorithms in recommendation systems, has become a mainstream research direction. Depending on the object it is built around, collaborative filtering is commonly divided into two kinds: user-based collaborative filtering and item-based collaborative filtering. The core ideas of the two differ. The former follows "people of one kind come together": it recommends to a user the items chosen by people with similar preferences. The latter follows "things of one kind come together": it recommends an item to users who have selected items similar to it. In both cases the degree of similarity is obtained through a similarity calculation. However, when the amount of data is large, the computational cost of collaborative filtering increases greatly. This thesis makes some improvements by adopting a collaborative filtering algorithm based on user clustering.

The main content of this thesis is as follows:

1. The background, research status, and applications of clustering and recommendation systems at home and abroad are introduced.

2. K-means clustering, the Canopy algorithm, and collaborative filtering recommendation algorithms are described in detail.

3. An improvement to collaborative filtering is proposed: clustering is introduced so that the search scope for the target user's nearest neighbors is delimited by the clustering result. K-means has some shortcomings, mainly that its initial centers are chosen relatively randomly and that the value of K is set rather vaguely. To offset these shortcomings, this thesis uses the Canopy clustering algorithm, which here serves as a pre-clustering step: it pre-clusters the data, and the values it produces are then used as the initial values for K-means. The experimental results show that this method is more accurate than the K-means clustering algorithm alone.

4. For large-scale data sets, the MapReduce computing framework is adopted to implement the algorithm, with the MovieLens data set as an application example. The Canopy-K-means collaborative filtering algorithm is implemented on MapReduce and used to make recommendation predictions for movies. In terms of both rating prediction accuracy and recommendation results, the improved clustering collaborative filtering algorithm improves all evaluation indexes to some extent compared with previous studies.
Keywords/Search Tags: Canopy-K-means, clustering, collaborative filtering, MapReduce
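
The pipeline described in the abstract (Canopy pre-clustering, K-means seeded with the Canopy result, then a nearest-neighbor search kept inside the target user's cluster) can be illustrated with a minimal single-machine sketch. This is not the thesis's MapReduce implementation. The function names, the thresholds t1 and t2, the toy rating matrix, the use of 0 to mark an unrated item, and the choice of cosine similarity are all assumptions made for illustration. The sketch also takes the first remaining point as each canopy center so the run is reproducible, whereas the usual Canopy algorithm picks that point at random.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def canopy_centers(points, t1, t2):
    """Canopy pre-clustering with loose threshold t1 and tight threshold t2."""
    if not t1 > t2 > 0:
        raise ValueError("thresholds must satisfy t1 > t2 > 0")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining[0]  # deterministic pick (assumption, see lead-in)
        members = [p for p in remaining if euclidean(seed, p) < t1]
        # The canopy's mean becomes one initial center for K-means.
        centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
        # Points within the tight threshold cannot seed another canopy.
        remaining = [p for p in remaining if euclidean(seed, p) >= t2]
    return centers

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
            for p in points]

def kmeans(points, centers, max_iter=100):
    """Plain K-means; K and the starting centers come from the caller."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's center unchanged
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def predict(ratings, labels, user, item):
    """Similarity-weighted average over same-cluster users who rated the item."""
    num = den = 0.0
    for other, row in enumerate(ratings):
        if other == user or labels[other] != labels[user] or row[item] == 0:
            continue
        w = cosine(ratings[user], row)
        num += w * row[item]
        den += abs(w)
    return num / den if den else None

# Toy user-by-item rating matrix; 0 means "not rated" (assumption).
ratings = [
    (5, 4, 0, 1),
    (4, 5, 1, 0),
    (5, 5, 2, 1),
    (1, 0, 5, 4),
    (0, 1, 4, 5),
]
initial = canopy_centers(ratings, t1=6.0, t2=4.0)  # hypothetical thresholds
labels, centers = kmeans(ratings, initial)
prediction = predict(ratings, labels, user=0, item=2)
print(len(initial), labels, prediction)
```

In this sketch the number of canopies fixes K and the canopy means fix the starting centers, which is one way to read "the values it produces are then used as the initial values for K-means". Restricting the neighbor search to one cluster means similarities are computed only against users in that cluster instead of against every user.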