Research Of Collaborative Filtering Recommendation Algorithm Based On Hadoop

Posted on: 2019-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: N N Guo
GTID: 2428330593451688
Subject: Electronics and Communications Engineering
Abstract/Summary:
The rapid development of Internet technology has brought convenience to every area of production and daily life, but it has also produced a large amount of redundant information, and Internet users now face serious information overload. Search engines, built for information retrieval, alleviate this overload to some extent but cannot meet people's personalized information needs. As data volumes have surged and recommender systems have matured, the major social networks, operators, and e-commerce companies have launched their own personalized recommendation products, a welcome development for users who find it difficult to choose. People now expect faster, more personalized, higher-quality services, so a recommendation strategy that works at big-data scale has become an important way to alleviate information overload and improve accuracy.

This thesis proposes a collaborative filtering recommendation algorithm that integrates social network information. It targets the data sparsity and cold-start problems in recommender systems and aims to improve both recommendation accuracy and algorithm efficiency. The algorithm is then parallelized on the Hadoop platform. First, the algorithm converts the original data into the corresponding rating matrix. The rating matrix and the social trust matrix are then decomposed into low-dimensional feature matrices: a user feature matrix, a commodity feature matrix, a user trust matrix, and a trust matrix. The optimal feature matrices are obtained by iteratively minimizing the algorithm's loss function with gradient descent. The missing entries in the rating matrix are then filled in using the weighted feature matrices, which addresses the data sparsity and cold-start problems. Finally, the predicted ratings are sorted for each user, and the top N products are recommended to that user.

To verify the accuracy and effectiveness of the algorithm, we use the Epinions dataset and five-fold cross-validation to measure the prediction error, as well as the accuracy, recall, and comprehensive evaluation metric of the two methods. To improve the scalability and recommendation efficiency of the recommender system, we analyze the above algorithm on the Hadoop platform and implement it with MapReduce. The efficiency of the algorithm and the reliability and scalability of the recommender system are then verified using the speedup and F1 values of the parallel implementation relative to stand-alone mode. Experimental results show that the proposed collaborative filtering algorithm reduces data sparsity, generally yields lower mean absolute error and root mean square error for the predicted ratings, and achieves higher recommendation accuracy. The parallelized social network recommendation algorithm improves the scalability of the whole recommender system and reduces execution time compared with the stand-alone version.
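
The pipeline summarized in the abstract (factorize the rating and trust matrices with shared user factors, fit by gradient descent, fill in missing ratings, recommend the top N, and score with MAE and RMSE) can be illustrated with a small single-machine sketch. This is not the thesis's implementation: the squared-error loss, the shared-factor form, the hyperparameters, the toy data, and the names `train`, `top_n`, and `mae_rmse` are all assumptions made for illustration, and the Hadoop/MapReduce parallelization is not shown.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train(ratings, trust, n_users, n_items, k=4, lr=0.01, reg=0.02,
          trust_weight=0.5, epochs=200, seed=0):
    """Stochastic gradient descent on an assumed joint loss:
    sum (r_ui - U_u.V_i)^2 + trust_weight * sum (t_uv - U_u.Z_v)^2 + L2 terms.
    U is shared between the rating and trust factorizations."""
    rng = random.Random(seed)

    def init(n):
        return [[rng.random() * 0.5 for _ in range(k)] for _ in range(n)]

    U, V, Z = init(n_users), init(n_items), init(n_users)
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            err = r - dot(U[u], V[i])
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
        for (u, v), t in trust.items():
            err = t - dot(U[u], Z[v])
            for f in range(k):
                uf, zf = U[u][f], Z[v][f]
                U[u][f] += lr * (trust_weight * err * zf - reg * uf)
                Z[v][f] += lr * (trust_weight * err * uf - reg * zf)
    return U, V, Z


def top_n(user, U, V, ratings, n=2):
    """Predict the user's unrated items and return the n highest-scoring ones."""
    rated = {i for (u, i) in ratings if u == user}
    scores = [(dot(U[user], V[i]), i) for i in range(len(V)) if i not in rated]
    scores.sort(reverse=True)
    return [i for _, i in scores[:n]]


def mae_rmse(pairs):
    """Mean absolute error and root mean square error over (actual, predicted) pairs."""
    errs = [abs(a - p) for a, p in pairs]
    mae = sum(errs) / len(errs)
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    return mae, rmse


# Hypothetical toy data: (user, item) -> rating and (truster, trustee) -> trust.
ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1,
           (2, 1): 2, (2, 3): 5, (3, 3): 4, (3, 4): 2}
trust = {(0, 1): 1.0, (1, 0): 1.0, (2, 3): 1.0}

U, V, Z = train(ratings, trust, n_users=4, n_items=5)
recs = top_n(0, U, V, ratings, n=2)
train_mae, train_rmse = mae_rmse(
    [(r, dot(U[u], V[i])) for (u, i), r in ratings.items()])
```

Sharing `U` between the two factorizations is what lets trust links inform predictions for users with few ratings, which is the intuition behind using social data against sparsity and cold start. In a five-fold cross-validation setup, `mae_rmse` would be applied to each held-out fold rather than to the training ratings as done here.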
Keywords/Search Tags: Hadoop, Collaborative Filtering, Parallelization, Rating Matrix, Social Matrix