Research And Implementation Of Movie Recommendation System Based On Spark

Posted on: 2022-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: T Lei
Full Text: PDF
GTID: 2518306347992599
Subject: Computer technology
Abstract/Summary:
In recent years, the number of Internet users has grown dramatically, driving the rapid development of Internet video platforms. At the same time, cooperation between online video service providers and traditional TV media has helped the online video market grow year by year. While enjoying this rich and varied supply of video, Internet users are constantly exposed to large amounts of redundant and useless information. The volume of data far exceeds what users can process and seriously interferes with their ability to select the information they need. As a result, information is poorly utilized and users become frustrated. This is the "information overload" problem of the information age.

In response, researchers proposed recommendation systems, which effectively address this problem. However, recommendation systems also face many challenges. For example, large data scale and complex algorithm design can make recommendation algorithms inefficient, and data sparsity and cold-start problems can lead to poor recommendation results. This thesis therefore focuses on two aspects, the recommendation algorithm and the system architecture. Taking movie recommendation as the practical requirement, it designs an efficient and usable recommendation system for big data settings.

At the system design level, the Spark distributed platform is chosen as the foundation of the recommendation system. Architecturally, the system is divided into an application layer, a computing layer, and a data layer. After the architecture design, a requirements analysis of the Spark-based movie recommendation system is carried out, and the functionality is divided into four modules: offline computing, real-time computing, system business, and data loading.

At the recommendation algorithm level, this thesis designs and optimizes algorithms for the two core modules, offline recommendation and real-time recommendation. For offline recommendation, to address the data sparsity problem, a collaborative filtering algorithm based on a latent factor model (LFM) is selected and solved with the alternating least squares (ALS) method. The ALS computation is parallelized on Spark, and Spark's partitioning and caching mechanisms are used to reduce data transfer between the computing nodes of the cluster, thereby reducing communication complexity and improving the computation speed of offline recommendation. For real-time recommendation, a movie recommendation priority algorithm incorporating time weights is designed to dynamically track changes in a user's interests over a period of time.

Finally, the offline recommendation experiments show that the improved offline algorithm computes significantly faster and that it also has an advantage in accuracy over other classic algorithms. In the real-time recommendation experiments, the time-weighted movie recommendation priority algorithm performs well on metrics such as accuracy and meets the system design requirements in the real-time test. After their effectiveness is verified, the offline and real-time recommendation algorithms are integrated into the recommendation system, where the functional modules cooperate well and the system operates normally.
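To make the offline step concrete, the following is a minimal single-machine sketch of a latent factor model fitted by alternating least squares. It is not the thesis's Spark implementation. The function names (`solve`, `update`, `als`, `predict`, `objective`), the `(user, item, rating)` triple format, and the defaults for `rank`, `reg`, and `iters` are all assumptions made for illustration.

```python
import random

def solve(A, b):
    """Solve the small dense system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def update(fixed, observed, rank, reg):
    """Ridge-regression update for one user (or item) with the other side held fixed.

    observed is a list of (index into fixed, rating) pairs.
    """
    A = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
    b = [0.0] * rank
    for idx, rating in observed:
        f = fixed[idx]
        for i in range(rank):
            b[i] += rating * f[i]
            for j in range(rank):
                A[i][j] += f[i] * f[j]
    return solve(A, b)

def als(triples, n_users, n_items, rank=2, reg=0.1, iters=20, seed=0):
    """Fit user factors U and item factors V to the observed (user, item, rating) triples."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in triples:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(iters):
        for u in range(n_users):      # fix V, update every user independently
            if by_user[u]:
                U[u] = update(V, by_user[u], rank, reg)
        for i in range(n_items):      # fix U, update every item independently
            if by_item[i]:
                V[i] = update(U, by_item[i], rank, reg)
    return U, V

def predict(U, V, u, i):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))

def objective(triples, U, V, reg):
    """Regularized squared error over observed ratings (the quantity ALS minimizes here)."""
    sq = sum((r - predict(U, V, u, i)) ** 2 for u, i, r in triples)
    return sq + reg * sum(x * x for row in U + V for x in row)
```

Within each half-step, every user (or item) update depends only on the fixed factors of the other side, so the updates are independent of one another. That independence is what lets a framework such as Spark distribute the work across nodes. How the thesis partitions and caches the data is not specified in this abstract.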
Keywords/Search Tags: Collaborative Filtering, Movie Recommendation, Spark, Alternating Least Squares, Latent Factor Model