
Research And Implementation Of Movie Recommendation System Based On Hadoop

Posted on: 2020-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Chen
Full Text: PDF
GTID: 2428330578450891
Subject: Computer technology
Abstract/Summary:
In the era of Web 2.0, Internet applications have become increasingly widespread, and a large number of movie resources have appeared online. Recommendation systems are widely used to help users find items of interest in such large and complex collections. Big data indicates that more and more people prefer to watch movies on movie websites, so these websites have good development prospects in China, and an accurate and efficient recommendation system is key to building one.

The movie recommendation system designed in this thesis is based on the Hadoop platform. Hadoop is a scalable, efficient, open-source distributed framework. For modern movie recommendation systems, the biggest problem is storing and processing rapidly growing volumes of data, and Hadoop is well suited to this kind of big data problem: the MapReduce distributed framework handles large-scale computation, and the HDFS distributed file system handles large-scale storage.

The core of a recommendation system is its recommendation algorithm. This thesis proposes a clustering-based collaborative filtering recommendation algorithm for the Hadoop platform. First, the Canopy algorithm coarsely clusters similar users according to their movie rating records. Then K-Means is run iteratively on the users within the same Canopy. The number of Canopy clusters is taken as the K value, and the Pearson correlation coefficient is used as the distance measure to cluster the users more precisely. The Pearson correlation coefficient reflects the correlation between two vectors and accounts for differences in how individual users score. Finally, each user's nearest neighbor set is constructed from the clustering result, and predicted scores are calculated to generate the recommendations. Combining Canopy with K-Means not only reduces the amount of computation to some extent but also makes the clustering result more accurate. Comparison experiments on the MovieLens dataset show that the proposed algorithm performs better in terms of accuracy and scalability.

The research and implementation process of this system is as follows:

1. Requirements analysis phase: Analyze the implementation goals of the system. First analyze the feasibility of the system, then analyze its functional and non-functional requirements from the perspectives of administrators and users.

2. Design phase: Design the overall architecture, functionality, and database of the system.

3. Movie recommendation algorithm: The recommendation algorithm proposed in this thesis combines clustering with collaborative filtering and is implemented on the Hadoop platform. First, a clustering algorithm groups similar users together according to their movie ratings. Then the Pearson correlation coefficient and thresholds are used to construct the nearest neighbor set. Finally, the target user's predicted scores are calculated for items rated by similar users, and the recommendations are generated by sorting these scores. The MapReduce distributed parallel framework makes the algorithm scalable, and HDFS makes the storage of massive data scalable. Comparison experiments on the MovieLens dataset are carried out to verify the accuracy of the proposed algorithm.

4. Implementation and testing phase: On the Eclipse platform, the front-end interface is coded using JSP, JavaScript, and related technologies. On the Hadoop platform, the algorithm is written in Java, and the functions of the system are tested.
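The pipeline described in the abstract (Canopy coarse clustering, Pearson similarity, neighbor-based score prediction) can be illustrated with a minimal single-machine sketch. It is not the thesis implementation, which is written in Java on Hadoop MapReduce. Several details are assumptions made only for illustration: the distance is taken as 1 minus the Pearson coefficient, the thresholds `t1` and `t2` and the sample ratings are hypothetical, and the prediction uses the common mean-centered weighted formula, which the abstract does not specify. The K-Means refinement step and the MapReduce parallelization are omitted.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over movies rated by both users; 0.0 if undefined."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[m] for m in common) / len(common)
    mean_b = sum(b[m] for m in common) / len(common)
    cov = sum((a[m] - mean_a) * (b[m] - mean_b) for m in common)
    var_a = sum((a[m] - mean_a) ** 2 for m in common)
    var_b = sum((b[m] - mean_b) ** 2 for m in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def canopy(ratings, t1, t2):
    """Coarse clustering. Assumed distance: 1 - pearson. Requires t1 > t2."""
    if t1 <= t2:
        raise ValueError("t1 (loose threshold) must exceed t2 (tight threshold)")
    remaining = sorted(ratings)
    canopies = []
    while remaining:
        center = remaining[0]
        members = [center]
        keep = []
        for user in remaining[1:]:
            dist = 1.0 - pearson(ratings[center], ratings[user])
            if dist < t1:       # loosely similar: joins this canopy
                members.append(user)
            if dist >= t2:      # not tightly bound: may seed or join later canopies
                keep.append(user)
        canopies.append(members)
        remaining = keep
    return canopies


def predict(ratings, user, movie, neighbors, threshold=0.0):
    """Mean-centered weighted prediction (assumed formula) from neighbors."""
    mean_u = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other in neighbors:
        if other == user or movie not in ratings[other]:
            continue
        sim = pearson(ratings[user], ratings[other])
        if sim <= threshold:    # keep only neighbors above the similarity threshold
            continue
        mean_o = sum(ratings[other].values()) / len(ratings[other])
        num += sim * (ratings[other][movie] - mean_o)
        den += abs(sim)
    return mean_u if den == 0 else mean_u + num / den


# Hypothetical ratings: user -> {movie: score}
ratings = {
    "u1": {"m1": 5, "m2": 4, "m3": 1},
    "u2": {"m1": 4, "m2": 5, "m3": 2, "m4": 5},
    "u3": {"m1": 1, "m2": 2, "m3": 5, "m4": 1},
}
clusters = canopy(ratings, t1=0.5, t2=0.3)
own_cluster = next(c for c in clusters if "u1" in c)
print(clusters, predict(ratings, "u1", "m4", own_cluster))
```

In this sketch, restricting the neighbor search to the target user's own canopy is what reduces the number of pairwise similarity computations, which mirrors the computational saving the abstract attributes to clustering before collaborative filtering.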
Keywords/Search Tags: movie recommendation, Hadoop, collaborative filtering recommendation algorithm, K-Means, MapReduce