Research Of Scalable Collaborative Filtering Algorithm Based On Mapreduce

Posted on:2016-09-18

Degree:Master

Type:Thesis

Country:China

Candidate:Y Shang

GTID:2308330470478594

Subject:Computer technology

Abstract/Summary:

With the rapid development of computer networking and storage technology, the volume of information on the Internet has grown explosively, and the era of Big Data has arrived. For users, finding useful information in such massive data is like looking for a needle in the ocean. How to mine the most valuable information from this data and deliver it to users has become a hot topic in both academia and industry. In recent years, recommendation systems have emerged worldwide as an intelligent, personalized information service technology and have been applied in e-commerce, video entertainment, social networks, and many other areas. Personalized recommendation has become a very important research direction for large companies and research institutions.

After many years of development, recommendation systems have branched into collaborative filtering recommendation, content-based recommendation, hybrid recommendation, and other approaches. Among these, collaborative filtering is the most mature and the most popular. However, the collaborative filtering algorithm has problems of its own, such as data sparsity and poor scalability. In the Big Data setting these problems are greatly magnified and become the bottleneck for its further development.

This thesis examines in depth the causes of the scalability problem in the traditional collaborative filtering algorithm and, taking into account the practical application environment of recommendation systems, discusses the algorithm's similarity calculation. Focusing on the preprocessing of input data, it improves the user-based collaborative filtering algorithm. The improved algorithm uses a hierarchical inverted index structure based on the "Bag-of-Words" model to filter out the valid data, and proposes a "soft-assignment" strategy to compensate for the errors introduced by data filtering.

For the implementation of the algorithm, cloud computing technology offers a new approach to the scalability problem. In the Big Data setting, the best choice is to implement the algorithm in parallel. This thesis analyzes the operating procedure of the Hadoop cloud computing platform and the programming model of the MapReduce distributed framework, and designs a MapReduce-based parallel implementation of the improved collaborative filtering algorithm.

The proposed method was evaluated on the Hadoop platform using both a real data set and a simulated data set. The results demonstrate that, compared with the traditional collaborative filtering algorithm, the improved method can solve the scalability problem efficiently while maintaining recommendation accuracy.
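The abstract does not give the details of the hierarchical index or the "soft-assignment" strategy, so the following is only a minimal sketch of the general idea under stated assumptions: a plain (non-hierarchical) inverted index from items to users restricts similarity computation to user pairs that share at least one rated item, and the work is arranged as map, shuffle, and reduce steps. It runs in memory in plain Python rather than on Hadoop; the toy ratings, the function names, and the choice of cosine similarity are all illustrative assumptions, not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy ratings: user -> {item: rating}
ratings = {
    "u1": {"a": 4.0, "b": 3.0},
    "u2": {"a": 4.0, "b": 3.0},
    "u3": {"c": 5.0},
    "u4": {"b": 2.0, "c": 1.0},
}

def map_phase(ratings):
    """Emit (item, (user, rating)) pairs, as a mapper would."""
    for user, items in ratings.items():
        for item, rating in items.items():
            yield item, (user, rating)

def shuffle(pairs):
    """Group values by key; the result is an inverted index item -> postings."""
    index = defaultdict(list)
    for key, value in pairs:
        index[key].append(value)
    return index

def reduce_phase(index):
    """Accumulate dot products only for user pairs that co-occur in a posting list."""
    dots = defaultdict(float)
    for postings in index.values():
        for (u, ru), (v, rv) in combinations(sorted(postings), 2):
            dots[(u, v)] += ru * rv
    return dots

def cosine_similarities(ratings):
    """Cosine similarity for every user pair that shares at least one item."""
    norms = {u: sqrt(sum(r * r for r in items.values()))
             for u, items in ratings.items()}
    dots = reduce_phase(shuffle(map_phase(ratings)))
    return {(u, v): dot / (norms[u] * norms[v]) for (u, v), dot in dots.items()}

sims = cosine_similarities(ratings)
```

In this toy data, users u1 and u3 have no item in common, so the pair never appears in any posting list and no similarity is computed for it. Skipping such pairs is what reduces the cost relative to comparing every user with every other user.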

Keywords/Search Tags:

Recommendation system, Collaborative Filtering, MapReduce, Inverted index

Related items

1	Research On Collaborative Filtering Recommendation Algorithms Based On Mapreduce
2	Design And Implementation Of Online Mall Recommendation System Based On Mapreduce
3	Design And Implementation Of Distributed Movie Recommendation System Based On Collaborative Filtering
4	Research On Clustering Collaborative Filtering Recommendation Algorithm Based On MapReduce
5	Research On Collaborative Filtering Recommended Algorithm And Implementation Of MapReduce
6	Research On Distributed Collaborative Filtering Recommendation Technology For Search Application
7	Research On Collaborative Filtering Recommendation Algorithm Based On Big Data
8	Design And Implementation Of Recommendation System Based On User Collaborative Filtering Algorithm
9	Personal Recommendation Based On Collaborative Filtering
10	Design And Implementation Of Travel Recommendation System Based On Improved Collaborative Filtering Algorithm