Research And Implementation Of Spark-based Product Personalized Recommendation System

Posted on: 2022-08-07  Degree: Master  Type: Thesis
Country: China  Candidate: X X Lin  Full Text: PDF
GTID: 2518306530480244  Subject: Electronics and Communications Engineering
Abstract/Summary:
By studying the shortcomings of traditional recommendation algorithms, this thesis proposes a hybrid recommendation algorithm that combines offline and real-time recommendation for personalized product recommendation, addressing the data sparsity, real-time response, and cold-start problems of recommendation systems. The main research contents are as follows.

(1) Offline recommendation. Item-based collaborative filtering (Item CF) struggles to compute item similarity on large-scale data, so we first design a similarity measure based on user contribution degree. The measure is built on confidence and borrows the TF-IDF idea to compute each user's contribution degree. We then implement the improved computation by borrowing the idea of the Apriori algorithm. Because Apriori is difficult to run on large-scale data sets, we study its parallelization and design and implement a parallel version on Spark to improve computational efficiency. Our analysis also shows that users' interests shift over time, so we propose a time-based weighting of user interest. The improved Item CF selects the candidate set; features are then built through data analysis, data pre-processing, and feature selection; finally, an XGBoost model produces the recommendation results from the features of the candidate set. The experimental results show that recommendation accuracy is significantly improved.

(2) Real-time recommendation. We take the improved Item CF as the core of the real-time recommendation algorithm and use weighted reservoir sampling to update the item similarity matrix incrementally. According to the sampling results, different update strategies are applied to different users to personalize the real-time recommendations. For the cold-start problem, we design a ranking list based on the Newton cooling algorithm as a supplementary recommendation for new users. Finally, we design the architecture of the real-time recommendation system and implement the leaderboard and the real-time recommendation algorithm on this architecture. The experimental results show that the real-time recommendation algorithm improves recommendation accuracy and meets the real-time requirements of the system.

(3) System implementation. Based on the hybrid recommendation algorithm proposed in this thesis, we design and implement a Spark-based personalized product recommendation system that provides the basic functions of personalized product recommendation.
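
The abstract names its techniques but gives no formulas, so the following Python sketch is only an illustration under stated assumptions, not the thesis's implementation. It shows (a) an Item CF similarity in which each user's co-occurrence vote is damped by an IDF-like factor `1 / log(1 + n)`, where `n` is the number of items the user interacted with, (b) weighted reservoir sampling in the Efraimidis-Spirakis style, and (c) a hot list whose scores decay exponentially, as in Newton's law of cooling. All function names, the damping formula, and the `cooling_rate` value are hypothetical choices made for this example; plain Python stands in for Spark.

```python
import heapq
import math
import random
from collections import defaultdict
from itertools import combinations


def item_similarity(user_items):
    """Item-item similarity with user-contribution damping.

    Assumption: a user with n distinct items contributes 1 / log(1 + n)
    to every item pair in their history (an IDF-like weight).
    """
    co = defaultdict(float)   # weighted co-occurrence per item pair
    count = defaultdict(int)  # number of users per item
    for items in user_items.values():
        unique = sorted(set(items))
        if not unique:
            continue
        weight = 1.0 / math.log(1.0 + len(unique))
        for item in unique:
            count[item] += 1
        for a, b in combinations(unique, 2):
            co[(a, b)] += weight
    # Normalize by the geometric mean of the two items' user counts.
    return {
        (a, b): w / math.sqrt(count[a] * count[b])
        for (a, b), w in co.items()
    }


def weighted_reservoir_sample(stream, k, rng=None):
    """Keep k items from a stream of (item, weight) pairs in one pass.

    Each item gets the key u ** (1 / weight) with u uniform in [0, 1);
    the k largest keys are retained in a min-heap.
    """
    if k <= 0:
        return []
    rng = rng or random.Random()
    heap = []  # entries are (key, arrival index, item)
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # non-positive weights are never sampled
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, index, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, index, item))
    return [item for _, _, item in heap]


def hot_list(events, now_hours, cooling_rate=0.1, top_n=10):
    """Rank items by heat, where each event's heat cools exponentially.

    events: iterable of (item, timestamp_hours); cooling_rate is a
    hypothetical per-hour decay constant.
    """
    heat = defaultdict(float)
    for item, ts_hours in events:
        heat[item] += math.exp(-cooling_rate * (now_hours - ts_hours))
    return sorted(heat, key=lambda item: heat[item], reverse=True)[:top_n]
```

In a setup like the one described, `item_similarity` would correspond to the offline candidate-selection step, `weighted_reservoir_sample` to choosing which interactions drive an incremental similarity update, and `hot_list` to the fallback shown to new users with no history. On Spark, the pair-counting loop would typically become a distributed aggregation over user histories.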
Keywords/Search Tags: Collaborative filtering, recommendation system, Spark, hybrid, XGBoost