Research And Implementation Of Video Recommendation System Based On Spark

Posted on: 2020-02-29 | Degree: Master | Type: Thesis
Country: China | Candidate: J Tao
GTID: 2428330575466034 | Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, data pervades people's daily lives, and we have entered the era of big data. This data holds a wealth of valuable information, and in the face of its explosive growth, how to mine that information effectively is an important research topic in big data. As an open-source distributed big data processing framework, Hadoop uses the Hadoop Distributed File System (HDFS) for storage and MapReduce (MR) for computation. However, when faced with massive data volumes, MR-based computing can no longer meet users' growing service requirements. Spark has greatly improved this situation. It is a distributed big data processing framework based on in-memory computing and is programmed through the Resilient Distributed Dataset (RDD) data model, which greatly reduces disk I/O compared with MR. The benefit is especially evident in highly iterative computations. As a result, Spark was quickly embraced by many enterprises and scholars after its release.

A recommendation system is an effective way to address information overload. This paper uses collaborative filtering to implement a video recommendation system on the Spark platform, helping users find the videos they actually want among massive video data. However, data processing on the Spark platform can still take a long time and fail to meet users' needs. A distributed Spark cluster parallelizes the computation and thereby improves efficiency. Through the parallel design of the recommendation algorithms and the analysis and design of the recommendation system, this paper completes the implementation of a Spark-based video recommendation system. The specific work is as follows:

(1) Parallel design of the video recommendation algorithms on Spark. First, based on an understanding of recommendation algorithms and of the Spark platform and its components, the parallel implementation of the recommendation algorithms on a distributed Spark cluster is designed in detail, covering both user-based and item-based collaborative filtering. Finally, comparative experiments measure the performance differences of the recommendation algorithms on a Spark cluster and on a Hadoop cluster.

(2) Implementation of the video recommendation system on Spark. The system preprocesses the acquired Web logs and loads them into a database, trains the recommendation model, and generates a recommendation list for each user by combining real-time and offline recommendation. Data processing and model training are carried out on a Spark cluster running Ubuntu, while the video recommendation, business logic module, and recommendation list display are hosted on a Windows system.
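As a rough illustration of the item-based collaborative filtering mentioned in the abstract, the following is a minimal sketch in plain Python, not the thesis's Spark implementation. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the sample ratings and the function names `item_similarities` and `recommend` are hypothetical. The two grouping stages (by user, then by video pair) are the parts that would typically be expressed as keyed transformations over RDDs on a Spark cluster.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical (user, video, rating) records; not data from the thesis.
ratings = [
    ("u1", "v1", 5.0), ("u1", "v2", 3.0),
    ("u2", "v1", 4.0), ("u2", "v2", 2.0), ("u2", "v3", 5.0),
    ("u3", "v2", 4.0), ("u3", "v3", 4.0),
]

def item_similarities(ratings):
    """Cosine similarity between every pair of videos that share a rater."""
    by_user = defaultdict(list)   # "group by user" stage
    norms = defaultdict(float)    # sum of squared ratings per video
    for user, video, r in ratings:
        by_user[user].append((video, r))
        norms[video] += r * r

    dots = defaultdict(float)     # "reduce by video pair" stage
    for items in by_user.values():
        for (a, ra), (b, rb) in combinations(sorted(items), 2):
            dots[(a, b)] += ra * rb

    sims = defaultdict(dict)
    for (a, b), dot in dots.items():
        s = dot / (sqrt(norms[a]) * sqrt(norms[b]))
        sims[a][b] = s
        sims[b][a] = s
    return sims

def recommend(user, ratings, sims, top_n=3):
    """Rank unseen videos by a similarity-weighted average of the user's ratings."""
    seen = {v: r for u, v, r in ratings if u == user}
    scores, weights = defaultdict(float), defaultdict(float)
    for video, r in seen.items():
        for other, s in sims[video].items():
            if other not in seen:
                scores[other] += s * r
                weights[other] += s
    ranked = sorted(
        ((scores[v] / weights[v], v) for v in scores if weights[v] > 0),
        reverse=True,
    )
    return [v for _, v in ranked[:top_n]]

sims = item_similarities(ratings)
print(recommend("u1", ratings, sims))  # prints ['v3']
```

User-based collaborative filtering follows the same pattern with the roles of users and videos swapped: similarities are computed between users who rated the same videos, and a user's recommendations come from the ratings of similar users.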
Keywords/Search Tags: Big data era, Spark platform, cluster parallelization, collaborative filtering, recommendation system