
Research Of Incremental Collaborative Filtering Algorithm Based On Apache Flink

Posted on: 2020-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Liu
GTID: 2428330620451110
Subject: Computer Science and Technology

Abstract/Summary:
With the rapid development of the Internet and the sharp increase in information resources, it is increasingly difficult for users to find useful information in massive data. To help users find useful information quickly, personalized recommendation systems have been widely studied. Collaborative Filtering (CF) is one of the best-known methods for building recommendation systems, and Matrix Factorization (MF) is a widely used collaborative filtering model. Recommendation algorithms based on Matrix Factorization achieve good results, but they adapt poorly to rapidly changing real-world data: user behavior changes quickly, and a static recommendation model does not adapt well when such incremental data arrive.

In view of these problems, the main research work of this thesis is as follows. First, for incremental collaborative filtering in a streaming environment, this thesis proposes a real-time incremental recommendation framework based on Apache Flink, the Realtime Incremental Recommendation Framework (RIRF). The framework allows the incremental collaborative filtering algorithms proposed in this thesis to handle the incremental recommendation problem on streaming data. Second, for the problem of incrementally updating the recommendation model, this thesis proposes an online-and-offline Collaborative Filtering method that improves on the traditional CF method: an online version of the Stochastic Gradient Descent (SGD) algorithm with offline knowledge, called Online SGD with Offline Knowledge (OSGDO). It can quickly update the recommendation model while processing incremental data and so make better recommendations. Third, for the ensemble learning problem in a stream processing environment, this thesis proposes two novel online bagging incremental recommendation algorithms that combine the incremental Funk SVD (Singular Value Decomposition) and incremental Bias SVD algorithms with the general Online Bagging mechanism: Online Bagging Funk SVD (OBFSVD) and Online Bagging Bias SVD (OBBSVD). To a certain extent, they reduce the variance of the predictions made by the recommendation model.

The three proposed methods, OSGDO, OBBSVD and OBFSVD, are implemented on the proposed RIRF framework. When incremental data arrive, the methods learn the new data incrementally and update the recommendation model synchronously. The experimental results show that the proposed methods are more effective than rebuilding the recommendation model on all data. These algorithms also perform well in practice and achieve impressive accuracy when tested on the well-known MovieLens and Netflix data sets.
Keywords/Search Tags: Recommendation System, Apache Flink, Collaborative Filtering, Matrix Factorization, Ensemble Learning, Incremental Learning
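
The abstract does not give the update rules of OBFSVD, so the following is only a minimal sketch of the two generic ingredients it names: a Funk-SVD-style model updated by one SGD step per incoming rating, and Online Bagging, in which each ensemble member trains on every new event k times with k drawn from Poisson(1). All class names, function names, and hyperparameter values here (`IncrementalFunkSVD`, `OnlineBaggingMF`, `poisson1`, `lr`, `reg`, `n_factors`) are assumptions made for illustration and are not taken from the thesis. The sketch leaves out Apache Flink, the offline knowledge used by OSGDO, and the bias terms of Bias SVD.

```python
import math
import random


class IncrementalFunkSVD:
    """Funk-SVD-style matrix factorization, updated one rating at a time."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0):
        self.n_factors = n_factors
        self.lr = lr      # SGD learning rate (illustrative value)
        self.reg = reg    # L2 regularization strength (illustrative value)
        self.rng = random.Random(seed)
        self.user_factors = {}  # user id -> latent vector
        self.item_factors = {}  # item id -> latent vector

    def _init_vector(self):
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.n_factors)]

    def predict(self, user, item):
        p = self.user_factors.get(user)
        q = self.item_factors.get(item)
        if p is None or q is None:
            return 0.0  # unseen user or item: no information yet
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, rating):
        """Apply one SGD step for a single (user, item, rating) event."""
        p = self.user_factors.setdefault(user, self._init_vector())
        q = self.item_factors.setdefault(item, self._init_vector())
        err = rating - sum(a * b for a, b in zip(p, q))
        for f in range(self.n_factors):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)


def poisson1(rng):
    """Draw a sample from Poisson(lambda=1) using Knuth's method."""
    threshold = math.exp(-1.0)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


class OnlineBaggingMF:
    """Ensemble of incremental models trained with Online Bagging."""

    def __init__(self, n_models=5, seed=0, **mf_kwargs):
        self.rng = random.Random(seed)
        self.models = [
            IncrementalFunkSVD(seed=seed + i + 1, **mf_kwargs)
            for i in range(n_models)
        ]

    def update(self, user, item, rating):
        # Each member sees the new event k ~ Poisson(1) times, which
        # approximates bootstrap resampling without storing the stream.
        for model in self.models:
            for _ in range(poisson1(self.rng)):
                model.update(user, item, rating)

    def predict(self, user, item):
        # Averaging member predictions is what reduces variance.
        return sum(m.predict(user, item) for m in self.models) / len(self.models)
```

A caller would feed each arriving rating to `update` and query `predict` at any time, for example `ensemble = OnlineBaggingMF(n_models=3)` followed by `ensemble.update("u1", "i1", 5.0)`. In a streaming job the per-event `update` call is the piece that would sit inside the stream operator; how RIRF actually partitions state and events is not described in the abstract.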