Research And Implementation Of Real Time Stream Computing Recommendation System Based On Spark Platform

Posted on: 2017-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: X D Zhang
Full Text: PDF
GTID: 2308330503964126
Subject: Software engineering
Abstract/Summary:
Faced with massive amounts of information, users cannot find what is actually useful to them, which lowers the efficiency with which information is used. This is the information overload problem. A recommendation system is an effective way to address it: it recommends information or goods that are likely to interest a user, based on that user's profile, interests, and so on. However, most existing recommendation systems update their results only through periodic calculation, so the results are not accurate enough. This is the real-time problem of recommendation systems. In addition, a system cannot make recommendations for new users or new goods because the necessary data is lacking. This is the so-called cold start problem.

To address these problems, this paper tackles the cold start problem of the recommendation algorithm and the real-time problem of the recommendation system, and designs and implements a real-time stream computing intelligent recommendation system on the Spark platform that updates recommendation results from real-time data. The main contents of this paper are:

(1) A matrix factorization algorithm based on clustering and feature mapping is proposed for the cold start problem of the recommendation algorithm. The algorithm first clusters the attribute information of users / products and finds the k nearest neighbors of the new user / new product. It then maps the features of the new user / new product: the feature information of the k nearest neighbors is used to calculate a feature vector for the new user / new product. This feature vector can be used to provide recommendations for the new user / new product, which solves the cold start problem of the algorithm.
The experiments show that the matrix factorization algorithm based on clustering and feature mapping proposed in this paper produces more precise recommendation results.

(2) A stream processing architecture capable of real-time computation is designed for the real-time problem of the recommendation system. The architecture divides the recommendation system into two parts, off-line calculation and on-line calculation. This makes full use of traditional off-line recommendation algorithms, combines them with on-line processing, and improves the real-time computing ability of the recommendation system. The system uses Spark for on-line processing: it computes in real time from the user's on-line ratings and the historical rating data set, and so updates the recommendation results in real time.

(3) A real-time stream computing recommendation system based on Spark is designed and implemented. This paper analyzes the requirements of the system, including functional requirements, performance requirements, and the overall structure. It then designs the three key modules of the system. The first is the simulated user ratings module, covering the range, format, frequency, and quantity of the data. The second is the real-time stream computing module based on Spark Streaming, covering the design of the real-time stream computation and some key functions. The third is the recommendation engine based on Spark MLlib, covering model training, model testing, and recommendation. Finally, this paper implements the real-time stream computing recommendation system and its three main functions: simulated user ratings, real-time stream computing, and the recommendation engine.
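
The feature-mapping idea in item (1) can be illustrated with a small sketch. This is not the thesis's implementation: it skips the clustering step and searches all known users directly, it assumes Euclidean distance on attribute vectors, and it assumes the new user's feature vector is the plain average of the neighbors' latent factors. The function names and the toy data are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two attribute vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(new_attrs, known_attrs, k):
    """Return the ids of the k known users closest to new_attrs."""
    ranked = sorted(known_attrs,
                    key=lambda uid: euclidean(new_attrs, known_attrs[uid]))
    return ranked[:k]

def cold_start_factors(new_attrs, known_attrs, known_factors, k=2):
    """Estimate latent factors for a new user by averaging (an assumption)
    the factors of the k nearest known users."""
    neighbors = k_nearest(new_attrs, known_attrs, k)
    dim = len(next(iter(known_factors.values())))
    return [sum(known_factors[n][d] for n in neighbors) / len(neighbors)
            for d in range(dim)]

def predict(user_factors, item_factors):
    """Predicted rating as the dot product of user and item factors."""
    return sum(u * v for u, v in zip(user_factors, item_factors))

# Toy data: attributes are [age, flag]; the factors stand in for the
# output of an already trained matrix factorization model.
known_attrs = {"u1": [25, 1], "u2": [27, 1], "u3": [60, 0]}
known_factors = {"u1": [1.0, 0.0], "u2": [0.8, 0.2], "u3": [0.0, 1.0]}

new_user = cold_start_factors([26, 1], known_attrs, known_factors, k=2)
score = predict(new_user, [1.0, 0.0])
```

With factors estimated this way, a new user with no rating history can be scored against every item in the same way as an existing user. A weighted average by distance would be an equally plausible choice for the mapping step.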
Keywords/Search Tags: cold start problem, real-time problem, real time stream computing, on-line calculation, recommendation system