A Research And Implementation Of Recommender System Based On Mahout And Hadoop

Posted on:2017-04-13

Degree:Master

Type:Thesis

Country:China

Candidate:G X Song

Full Text:PDF

GTID:2308330488468504

Subject:Computer technology

Abstract/Summary:

In recent years, with the rapid development of the Internet and of e-commerce as one of its most representative applications, data and information have grown explosively. This makes it difficult for users to pick out the items they actually need from a very large number of candidates. A detailed study of recommender systems, which play an increasingly important role in today's society, therefore has considerable practical significance. Improving the accuracy of a recommender system not only yields substantial economic benefits but also gives the system's users more personalized and convenient services.

Collaborative filtering has been applied widely and successfully in recommender systems, but its performance is unsatisfactory when the rating data are sparse. Starting from the basic concepts of recommendation algorithms, this thesis discusses several ways of calculating similarity in collaborative filtering and proposes a new similarity measure based on the Bhattacharyya coefficient. Experiments on the public datasets released by MovieLens, Netflix and Yahoo Music verify the validity of the new similarity measure.

Because a recommender system is data-intensive and prone to explosive data growth, the thesis also analyzes the computing principle of the Hadoop distributed computing framework, discusses in detail the recommendation algorithms in the well-known machine learning framework Mahout, and describes how conveniently Mahout supports implementing collaborative filtering with the Bhattacharyya coefficient as its similarity measure. The principle of combining the two frameworks is then discussed.

Finally, a system design and a prototype implementation are given. The implementation of the Bhattacharyya-coefficient-based collaborative filtering algorithm on Mahout is described in detail, together with its source code. Because a long-running system will inevitably require it, a scheme and steps for migrating the system to the Hadoop distributed computing framework are also given. By combining Mahout and Hadoop, the system can handle the storage and computation of big data well.

In conclusion, the innovation of this thesis lies mainly in the following two points:

1) Because collaborative filtering algorithms rely on co-rated data, the results of a recommender system are not accurate enough on sparse data. We propose a new similarity measure based on the Bhattacharyya coefficient to address this problem. Experiments on public datasets demonstrate the validity of the new measure in sparse settings.

2) To make the collaborative filtering algorithm based on the Bhattacharyya coefficient usable in practice, we implemented it on the Mahout framework and give the source code of the key steps.
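
As a rough illustration of the Bhattacharyya-coefficient idea mentioned in the abstract, the sketch below computes the standard coefficient, BC(p, q) = sum over rating values h of sqrt(p_h * q_h), between the rating distributions of two items. The abstract does not give the thesis's exact formula or how it is combined with other similarity terms, so the function names, the 1 to 5 rating scale and the sample data here are assumptions made purely for illustration. The point it tries to show is that the two distributions can be compared even when no user has rated both items.

```python
from collections import Counter
from math import sqrt

# Assumed rating scale (hypothetical; the abstract does not specify one).
DEFAULT_SCALE = (1, 2, 3, 4, 5)


def rating_distribution(ratings, scale=DEFAULT_SCALE):
    """Return the fraction of ratings at each value of the scale.

    Ratings are assumed to take only values that appear in `scale`.
    """
    total = len(ratings)
    if total == 0:
        return {h: 0.0 for h in scale}
    counts = Counter(ratings)
    return {h: counts.get(h, 0) / total for h in scale}


def bhattacharyya_coefficient(ratings_a, ratings_b, scale=DEFAULT_SCALE):
    """Standard Bhattacharyya coefficient between two discrete rating distributions.

    The result lies in [0, 1]: 1 for identical distributions, 0 when the
    two items never receive the same rating value.
    """
    p = rating_distribution(ratings_a, scale)
    q = rating_distribution(ratings_b, scale)
    return sum(sqrt(p[h] * q[h]) for h in scale)


if __name__ == "__main__":
    # Hypothetical ratings for two items, given by entirely different users.
    item_a = [5, 5, 4, 4]
    item_b = [5, 4, 4, 3, 5, 4]
    print(bhattacharyya_coefficient(item_a, item_b))
```

Because the measure uses each item's whole rating histogram rather than only co-rated entries, it still returns a meaningful value in the sparse case where classical measures such as cosine or Pearson similarity have no overlapping ratings to work with.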

Keywords/Search Tags:

recommender system, collaborative filtering, Bhattacharyya coefficient, sparse data, Mahout, Hadoop

Related items

1	Research And Implementation Of Recommender System Based On Hadoop And Mahout
2	Research On Collaborative Filtering Recommendation Algorithm Based On Big Data
3	Research On Collaborative Filtering Recommendation Algorithm Based On Big Data
4	Research On The Collaborative Filtering Technique Based On Bhattacharyya Coefficient And Clustering
5	Research Of Commodity Recommender System Based-on Internet Users’ Features
6	A Mahout-based Collaborative Filtering Recommendation Engine: Research And Implementation
7	Based On Collaborative Filtering A Customized Movie Recommender Web Service Design And Implementation
8	Research And Application Of Video Recommendation Technology Based On Hadoop And Mahout
9	Research On Data Prediction Based On Filtering Fusion
10	Collaborative Filtering Algorithm Oriented User Interests