The Research And Implementation Of Movie Recommendation System Based On Flink

Posted on: 2021-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Zhang
GTID: 2428330623967862
Subject: Control Science and Engineering
Abstract/Summary:
With the continued growth of Internet infrastructure and the cooperation between online video operators and traditional TV media, the online video market is expanding year by year. While users enjoy a wide variety of video content, they are also flooded with redundant and irrelevant information. The volume of data far exceeds what users can process, which seriously interferes with their ability to choose the information they need, results in a very low information utilization rate, and can even cause frustration and aversion. Recommendation systems are an effective remedy for this problem, known as information overload. They have developed rapidly in recent years but still face many challenges: large data volumes and complex algorithm design make recommendation algorithms inefficient, while data sparsity and the cold-start problem degrade recommendation quality. This thesis studies several common recommendation algorithms, analyzes their advantages and disadvantages, and aims to improve and optimize them so as to improve the system's recommendation quality.

First, based on an analysis of the data structure of the MovieLens dataset, a collaborative filtering algorithm is chosen to implement the system's offline recommendation function. The underlying computing engine is the distributed big data computing platform Apache Spark. By extending Spark's automatic partitioning and customizing the data partitioning method, the data transferred between cluster nodes and the amount of Cartesian product computation are reduced, while data is hashed evenly across nodes and execution tasks are distributed evenly. The improved collaborative filtering algorithm runs significantly faster than the original algorithm, while its prediction accuracy is essentially unchanged.

Second, a real-time recommendation algorithm is designed around users' rating behavior. Using movie tag information, the algorithm applies TF-IDF to compute the similarity between movies. A time weight factor is also introduced to construct a user interest formula, from which a real-time recommendation list is generated for each user. Simulation results on the Apache Flink stream computing platform show that the time weighting factor affects the accuracy and recall of the real-time recommendation algorithm, and that both metrics reach their maximum when its two parameters are set to 0.25 and 0.6, respectively; that is, the recommendation quality is best at these settings.

Finally, based on the above algorithms, a user-friendly movie recommendation system is implemented using current mainstream application development frameworks and related components. The system consists of four main parts: a data loading module, an offline recommendation module, a real-time recommendation module, and a system business module. The Flink-based implementation of the real-time recommendation algorithm is also significantly more efficient than the one based on Spark Streaming.
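
The real-time algorithm described above (TF-IDF similarity over movie tags, combined with a time-weighted user interest score) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual interest formula, so everything here is an assumption made for illustration: the smoothed IDF, the exponential decay `exp(-decay * age)`, the `decay` parameter and its default value, and the function names `tfidf_vectors`, `cosine`, and `interest_score`. It runs on plain Python rather than Flink.

```python
import math
from collections import Counter


def tfidf_vectors(movie_tags):
    """Build a sparse TF-IDF vector (tag -> weight) for each movie.

    movie_tags maps a movie id to its list of tags. The IDF is smoothed
    (an assumption) so that a tag shared by every movie keeps a nonzero weight.
    """
    n_movies = len(movie_tags)
    # Document frequency: in how many movies each tag appears.
    doc_freq = Counter(tag for tags in movie_tags.values() for tag in set(tags))
    vectors = {}
    for movie, tags in movie_tags.items():
        counts = Counter(tags)
        total = sum(counts.values())
        vectors[movie] = {
            tag: (count / total)
            * (math.log((1 + n_movies) / (1 + doc_freq[tag])) + 1.0)
            for tag, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(tag, 0.0) for tag, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def interest_score(candidate, recent_ratings, vectors, now, decay=0.1):
    """Hypothetical time-weighted interest in `candidate`.

    recent_ratings is a list of (movie_id, rating, timestamp) tuples, with
    timestamps in the same unit as `now` (for example, days). Each rating is
    weighted by its tag similarity to the candidate and discounted by
    exp(-decay * age), so older ratings count less.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for movie, rating, timestamp in recent_ratings:
        similarity = cosine(vectors[candidate], vectors[movie])
        weight = similarity * math.exp(-decay * (now - timestamp))
        weighted_sum += weight * rating
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0.0 else 0.0
```

In this sketch, a recommendation list for a user would be produced by scoring each unseen movie with `interest_score` whenever a new rating arrives and keeping the highest-scoring candidates. Setting `decay` to 0 removes the time weighting, which makes it easy to compare results with and without it.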
Keywords/Search Tags: collaborative filtering recommendation, Apache Spark, Apache Flink, user interest formula, time weight