Research On Dynamic Recommendation Parallelization Algorithm Based On Clustering

Posted on:2018-04-15

Degree:Master

Type:Thesis

Country:China

Candidate:L L Li

GTID:2358330515451393

Subject:Software engineering

Abstract/Summary:

The amount of data is growing dramatically with the popularity of the Internet and mobile devices, and this flood of information has led to serious information overload. How to analyze a user's interests quickly within abundant information, and how to recommend relevant content to that user, has become a hot research topic. As one effective way to address this problem, collaborative filtering recommendation algorithms achieve personalized recommendation by building a model from users' preference information and historical data. However, as data size increases, problems such as data sparsity, real-time performance, and accuracy become increasingly severe, which leads to a significant decrease in the recommendation quality of the Slope One algorithm. Focusing on the problems of low accuracy, high computational complexity, and slow running speed, this thesis completes the following work:

(1) The thesis analyzes and summarizes the concepts and workflows of collaborative filtering recommendation algorithms, similarity measures, and clustering algorithms. In addition, the architecture, workflow, and setup process of the Hadoop platform and the Spark framework are introduced.

(2) The SBTICK-means parallel clustering algorithm based on the Spark framework is proposed. First, the data is preprocessed with Canopy. Then, during the K-means iterations, the triangle inequality theorem is used to reduce redundant computation and accelerate clustering. Experimental results show that the proposed algorithm improves clustering efficiency while maintaining accuracy, and that the size-up rate, scale-up rate, and operating speed are also improved.

(3) A weighted Slope One algorithm based on clustering and the Spark framework is put forward. First, a time weight is incorporated into the traditional rating similarity so that the dynamic change of a user's interest over time is reflected. Second, a comprehensive similarity is computed by introducing item attributes, and the set of nearest neighbors is generated using the SBTICK-means algorithm proposed in (2). Finally, rating prediction and recommendation are realized in combination with a time decay function. Experimental results show that the improved algorithm is more accurate than both the traditional Slope One algorithm and Slope One based on user similarity, and that it runs more efficiently than on the Hadoop platform.

In summary, this thesis starts from the basic idea and the deficiencies of the Slope One algorithm and optimizes its rating prediction accuracy, real-time performance, and scalability. The whole algorithm is then parallelized with the Spark framework. The work in this thesis enhances the accuracy and efficiency of clustering and recommendation, and it has research value and practical significance for further study of recommendation on massive data.
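For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal single-machine sketch of the standard weighted Slope One predictor, in which each item-pair deviation is weighted by the number of users who rated both items. It is not the thesis's improved algorithm: the time weight, item-attribute similarity, SBTICK-means neighbor selection, and Spark parallelization are not reproduced here. The function names `train_deviations` and `predict` and the toy ratings are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def train_deviations(ratings):
    """Compute average rating deviations between item pairs.

    ratings: {user: {item: rating}}
    Returns (dev, count) where dev[j][i] is the mean of (r_uj - r_ui)
    over users who rated both j and i, and count[j][i] is how many
    such users there are.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i == j:
                    continue
                diff_sum[j][i] += r_j - r_i
                count[j][i] += 1
    dev = {
        j: {i: diff_sum[j][i] / count[j][i] for i in diff_sum[j]}
        for j in diff_sum
    }
    return dev, count


def predict(user_ratings, target, dev, count):
    """Weighted Slope One prediction of `target` for one user.

    Each rated item i contributes (dev[target][i] + r_i), weighted by
    the number of co-rating users. Returns None if no rated item has
    been co-rated with the target.
    """
    numerator = 0.0
    denominator = 0
    for i, r_i in user_ratings.items():
        c = count.get(target, {}).get(i, 0)
        if i == target or c == 0:
            continue
        numerator += (dev[target][i] + r_i) * c
        denominator += c
    return numerator / denominator if denominator else None


# Hypothetical toy data: three users, three items.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 2},
    "bob": {"a": 3, "b": 4},
    "carol": {"b": 2, "c": 5},
}
dev, count = train_deviations(ratings)
bob_c = predict(ratings["bob"], "c", dev, count)
```

In this baseline every co-rating counts equally regardless of when it was made. The abstract's time weight and time decay function would presumably replace or scale these uniform counts, but their exact form is not given in the abstract.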

Keywords/Search Tags:

Slope One, Clustering, Spark, Time Weight, Item Attribute

Related items

1	Improvement And Implementation Of Slope One Collaborative Recommendation Algorithm Based On Spark
2	Research On The Weighted Slope One Recommendation Technology Based On Clustering
3	Research On Recommendation Algorithms Based On Probability Matrix Factorization Integrating Time Factor And Item Clustering
4	Research And Implementation Of Parallel Recommendation Algorithm Based On Spark
5	Research On Personalized Recommendation Algorithms Based On Clustering And Collaborative Filtering
6	Research Of Weighted Slope One Algorithm Based On Clustering
7	Research Of Collaborative Filtering Recommender Algorithm Based On Time Weight
8	Research On Overlapping Clustering And Attribute Graph Clustering Algorithms
9	Application Of Clustering Algorithm Based On Attribute Weighting In Bank Customer Segmentation
10	A Collaboration Filtering Recommendation Algorithm Based On Fusing User Rating And Item Attribute