
Research And Implementation Of Recommendation Engine Based On Hadoop Platform And Mahout Framework

Posted on: 2019-10-29   Degree: Master   Type: Thesis
Country: China   Candidate: J J Qian   Full Text: PDF
GTID: 2428330572958176   Subject: Control theory and control engineering
Abstract/Summary:
With the rapid development of the Internet, technologies such as Web 2.0, the Internet of Things, and e-commerce have penetrated all aspects of our lives, and the amount of information has exploded. Finding useful information in this mass of data has become a very difficult problem, and one of the most direct and effective ways to solve it is the recommendation engine. However, as data volumes grow, the traditional recommendation engine running on a single machine can no longer meet the requirements of enterprises, and introducing big data technology into the recommendation field has become a trend. This thesis therefore proposes the design and implementation of a recommendation engine based on the Hadoop platform and the Mahout framework.

The thesis first analyzes the relevant big data technologies, then discusses the basic framework of the recommendation engine and the mainstream recommendation algorithms. It analyzes several key problems in recommendation engines, namely scalability, data sparsity, and cold start, and puts forward corresponding solutions. The user-based collaborative filtering algorithm is studied and its shortcomings are analyzed. The traditional user-based collaborative filtering algorithm considers only the rating given to an item and ignores the time at which the rating was made. It also considers only the similarity between users, ignoring both the potential relationship between a user and an item and the influence of the user's own characteristic attributes on the recommendation result. To address these problems, a time attenuation function, a preference degree function, and a user feature vector are introduced to improve the traditional user-based collaborative filtering algorithm. The improved algorithm is then implemented in distributed form using MapReduce and the Mahout machine learning framework. The deployment of the Hadoop cluster and the setup of the Mahout machine learning framework are also completed.

A series of comparative experiments on the improved algorithm is carried out using real movie data sets. The experimental results show that the improved algorithm increases recommendation accuracy, and that the distributed recommendation algorithm on the Hadoop platform executes much more efficiently than the single-machine recommendation algorithm. Finally, the thesis designs a prototype movie recommendation engine built on the big data platform and the movie data set, using the improved user-based collaborative filtering algorithm. This work covers the overall system architecture, the overall system flow, the database design for each part of the system, and the design of the system's functional modules. The movie recommendation engine is then tested. The test results show that every functional module of the system operates normally and that performance and user experience are good.
Keywords/Search Tags: Hadoop, Mahout, recommendation engine, collaborative filtering
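
To make the idea of time-aware user-based collaborative filtering from the abstract more concrete, the following is a minimal single-machine sketch in Python. It is illustrative only: the abstract does not give the thesis's actual time attenuation function, preference degree function, or user feature vector, so this sketch assumes an exponential decay and cosine similarity, omits the other two improvements, and does not use MapReduce or Mahout. All names (`decay`, `weighted`, `cosine`, `recommend`, `rate`) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# user -> item -> (rating, timestamp); timestamps are in arbitrary units (e.g. days)
Ratings = Dict[str, Dict[str, Tuple[float, float]]]


def decay(t: float, now: float, rate: float) -> float:
    """Assumed exponential time decay: older ratings receive smaller weights."""
    return math.exp(-rate * (now - t))


def weighted(ratings: Ratings, now: float, rate: float) -> Dict[str, Dict[str, float]]:
    """Scale every rating by its time-decay weight."""
    return {
        user: {item: r * decay(t, now, rate) for item, (r, t) in items.items()}
        for user, items in ratings.items()
    }


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def recommend(ratings: Ratings, user: str, now: float,
              rate: float = 0.01, k: int = 2, n: int = 3) -> List[str]:
    """Rank items the user has not rated, using the k most similar users."""
    w = weighted(ratings, now, rate)
    neighbours = sorted(
        ((cosine(w[user], w[other]), other) for other in w if other != user),
        reverse=True,
    )[:k]
    scores: Dict[str, float] = {}
    norms: Dict[str, float] = {}
    for sim, other in neighbours:
        if sim <= 0:
            continue  # ignore users with no overlap
        for item, value in w[other].items():
            if item in w[user]:
                continue  # skip items the target user already rated
            scores[item] = scores.get(item, 0.0) + sim * value
            norms[item] = norms.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / norms[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:n]]
```

In a distributed setting such as the one the abstract describes, the similarity computation and score aggregation would be expressed as MapReduce jobs rather than in-memory loops; the sketch above only shows the arithmetic under the stated assumptions.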