Research Of Sparsity And Scalability Problem In Collaborative Filtering

Posted on:2016-09-06

Degree:Master

Type:Thesis

Country:China

Candidate:X H Li

GTID:2308330479484679

Subject:Control Science and Engineering

Abstract/Summary:

As society moves from the IT (information technology) era to the DT (data technology) era, the volume of information people face every day is growing rapidly, and information overload has become a major obstacle to industrial development. In e-commerce especially, users must spend more and more time picking out the items they want from a wide variety of goods. Recommender systems address this problem effectively, and collaborative filtering algorithms in particular have been very successful. However, as the number of items grows, the user-item rating matrix becomes increasingly sparse, which seriously degrades the accuracy of traditional collaborative filtering. In addition, because of limited computing capability, traditional collaborative filtering suffers from poor computational efficiency and scalability on big data.

This thesis studies the data sparsity and poor scalability of collaborative filtering on big data. It fills in the user-item rating matrix to reduce data sparsity and uses a distributed algorithm to improve scalability.

First, when too few ratings are available, the user-item rating matrix is sparse, which reduces the accuracy of collaborative filtering. The thesis proposes an assist-factor similarity derived from the overall distribution of each item's rating vector, combines the assist factor with a traditional similarity measure, and on that basis proposes an assist-factor-based collaborative filtering algorithm. For items that share too few common ratings, this improves recommendation accuracy that would otherwise be poor. Experiments show that the algorithm effectively eases data sparsity and improves recommendation accuracy.

Second, to deal with poor scalability on big data, the thesis gives a distributed implementation of a collaborative filtering recommendation algorithm based on Hadoop. Recommended items are obtained by multiplying the co-occurrence matrix by each user's preference vector, and scalability is improved by dynamically adding cluster nodes. For the multiplication, the algorithm uses an improved partial-product method instead of traditional matrix multiplication, which avoids a large number of wasted calculations caused by null entries in the matrix and improves the utilization of computing resources. Finally, experiments show that the algorithm effectively improves computational efficiency and scales well on big data.
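
The multiplication of a co-occurrence matrix by a user's preference vector can be illustrated with a small single-machine sketch. The abstract does not give the details of the thesis's improved partial-product method or its Hadoop jobs, so the code below only shows the general idea under stated assumptions: the matrix is stored sparsely, and each rating the user has made contributes one partial product (a co-occurrence column scaled by that rating), so null entries are never visited. The function names, the toy ratings, and the choice of plain co-occurrence counts are all hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def build_cooccurrence(ratings):
    """Count, for each ordered item pair (i, j), how many users rated both.

    Only pairs that actually co-occur are stored, so the matrix stays sparse.
    """
    cooc = defaultdict(lambda: defaultdict(int))
    for prefs in ratings.values():
        for i, j in permutations(prefs, 2):
            cooc[i][j] += 1
    return cooc


def recommend(user, ratings, cooc, top_n=3):
    """Score unrated items by summing partial products.

    Each rated item j contributes (column j of the co-occurrence matrix) * rating.
    Null entries are skipped because they are simply absent from the dicts.
    """
    prefs = ratings[user]
    scores = defaultdict(float)
    for j, rating in prefs.items():              # only items the user rated
        for i, count in cooc.get(j, {}).items():  # only nonzero co-occurrences
            if i not in prefs:                    # do not recommend rated items
                scores[i] += count * rating
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy data: user -> {item: rating}
ratings = {
    "u1": {"A": 5.0, "B": 3.0},
    "u2": {"A": 4.0, "B": 2.0, "C": 5.0},
    "u3": {"B": 4.0, "C": 3.0, "D": 2.0},
}
cooc = build_cooccurrence(ratings)
print(recommend("u1", ratings, cooc))  # [('C', 11.0), ('D', 3.0)]
```

In a MapReduce setting, each partial product could be emitted by a mapper and summed per item in a reducer, which is one way the work might be spread over additional cluster nodes; that mapping is an assumption here, not a description of the thesis's implementation.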

Keywords/Search Tags:

Collaborative Filtering Recommendation Algorithm, data sparsity, scalability, distributed, Hadoop

Related items

1	Research On Key Problems Of Collaborative Filtering Algorithm In Recommendation System
2	Research On Key Problems Of Collaborative Filtering Recommendation Algorithms
3	Research On Distributed Collaborative Filtering Recommendation Algorithm Based On Hadoop
4	Research And Application Of Recommendation System Based On Hadoop
5	Research On Collaborative Filtering Recommendation Algorithm Faced To Sparsity Matrix Bias
6	Collaborative Filtering Recommendation Algorithm On Data Sparsity Problem From Statistical Perspective
7	Research On Data Sparsity Of Collaborative Filtering Recommendation Algorithm
8	Research On Sparsity And Scalability Problem In Collaborative Filtering
9	Research On Collaborative Filtering Recommendation Algorithm Based On Social Network
10	Research And Implementation Of Collaborative Filtering Recommendation Algorithm Which Faced To Sparsity Data