A Research Of Data Sparsity Problem And Real-time Recommender In Collaborative Filtering

Posted on: 2017-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Xu
GTID: 2308330503461543
Subject: Software engineering
Abstract/Summary:
The explosion of data has made recommender systems (RS) increasingly popular and important. Almost all e-commerce websites use recommender systems, and so do search engines. There are two reasons for this. First, a huge amount of information is available, but people want only a small part of it. Second, the underlying preferences that people expect a system to capture can also be used for other purposes. Recommender systems have developed rapidly and include collaborative filtering, knowledge-based recommendation, hybrid recommendation, and other approaches, many of which rely on mature techniques. Among them, collaborative filtering is one of the most successful and widely used in e-commerce. A typical collaborative filtering algorithm maintains a matrix of items and their ratings, which is used to compute user similarity and generate recommendations. Unfortunately, the algorithm runs into problems as the number of users and items grows: high dimensionality and high sparsity. These two problems pose major challenges for e-commerce applications. In this thesis, building on the original algorithm, we improve several existing methods and propose new ones to address recommendation accuracy under data sparsity and real-time recommendation.

For the data sparsity problem, in order to increase recommendation accuracy, we extract external information, use it to calculate Jaccard similarity and predictions, and populate the training set with those predictions. Specifically, the external information consists of attributes such as users' age and occupation and items' categories, over which the Jaccard similarity is calculated. This similarity is then combined through weighted fusion with the similarity calculated by typical KNN (k-nearest neighbor) collaborative filtering. We determined the set of weights through many experiments.
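The weighted fusion described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration, not the thesis's implementation: attributes are encoded as sets of strings, the rating-based similarity is a cosine over co-rated items, and the weight `alpha` is a placeholder, since the abstract does not state the weights that were chosen.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def cosine_on_common(r_u: dict, r_v: dict) -> float:
    """Cosine similarity over co-rated items (assumed rating-based similarity)."""
    common = r_u.keys() & r_v.keys()
    if not common:
        return 0.0
    dot = sum(r_u[i] * r_v[i] for i in common)
    norm_u = math.sqrt(sum(r_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(r_v[i] ** 2 for i in common))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fused_similarity(attrs_u: set, attrs_v: set,
                     r_u: dict, r_v: dict, alpha: float = 0.5) -> float:
    """Weighted fusion: alpha * rating similarity + (1 - alpha) * Jaccard.

    alpha is a hypothetical weight; the thesis tunes its weights experimentally.
    """
    return (alpha * cosine_on_common(r_u, r_v)
            + (1.0 - alpha) * jaccard(attrs_u, attrs_v))


# Hypothetical users: attribute sets and {item_id: rating} dictionaries.
attrs_u = {"age:25-34", "occupation:engineer"}
attrs_v = {"age:25-34", "occupation:student"}
ratings_u = {1: 5, 2: 3}
ratings_v = {1: 5, 2: 3, 3: 1}

similarity = fused_similarity(attrs_u, attrs_v, ratings_u, ratings_v, alpha=0.5)
```

Because the Jaccard term depends only on attributes, it remains informative for pairs of users with few or no co-rated items, which is the situation in which a sparse rating matrix leaves the rating-based term unreliable.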
With these weights, the results are better than those of the baseline experiment.

To address the real-time recommendation problem, we introduce CURE (Clustering Using Representatives), a method from the clustering field, to turn the search over all users into a search over representatives, which reduces the size of the search space and improves real-time performance. The key to this method, and the core work of this thesis, is how the representatives are calculated. We modified parts of the methods proposed by previous researchers. Many experiments were carried out for this part as well, and our method gives better results.
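The idea of searching representatives instead of all users can be sketched as follows. This is only an illustration under stated assumptions: users are plain numeric vectors, the clusters are taken as already given, and representatives are chosen with the standard CURE recipe (well-scattered points shrunk toward the cluster centroid). The thesis's own modified way of calculating representatives is not described in the abstract and is not reproduced here; `num_reps` and `shrink` are hypothetical parameters.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def cure_representatives(points, num_reps=2, shrink=0.5):
    """Pick well-scattered points, then shrink them toward the centroid."""
    center = centroid(points)
    # Start with the point farthest from the centroid.
    chosen = [max(points, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(num_reps, len(points)):
        # Add the point farthest from its nearest already-chosen point.
        chosen.append(max(
            points,
            key=lambda p: min(math.dist(p, c) for c in chosen),
        ))
    # Move each chosen point a fraction `shrink` of the way to the centroid.
    return [[x + shrink * (m - x) for x, m in zip(p, center)] for p in chosen]


def nearest_cluster(user, reps_by_cluster):
    """Id of the cluster whose closest representative is nearest to `user`."""
    return min(
        reps_by_cluster,
        key=lambda cid: min(math.dist(user, r) for r in reps_by_cluster[cid]),
    )


# Hypothetical clusters of user vectors, assumed to come from a prior clustering.
clusters = {
    "a": [[0, 0], [0, 2], [2, 0], [2, 2]],
    "b": [[10, 10], [10, 12], [12, 10], [12, 12]],
}
reps = {cid: cure_representatives(pts) for cid, pts in clusters.items()}

# The target user is compared with a few representatives, not with every user.
target_cluster = nearest_cluster([1, 1], reps)
```

Once the nearest cluster is found, a neighbor search would only need to look at the users inside that cluster, so the per-request cost depends mainly on the number of representatives and the cluster size rather than on the total number of users.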
Keywords/Search Tags: CURE, Recommender Systems, MovieLens, Collaborative Filtering