Research On Top-N Recommendation With Collaborative Filtering

Posted on: 2015-04-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Zhao
Full Text: PDF
GTID: 1228330422493443
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer and web technology, the Internet has gradually become part of people’s daily lives, and the way people access information has changed completely. The vast amount of information on the web provides a great deal of useful content. However, it also leads to the problem of information overload, which makes it increasingly hard for people to find relevant information. Recommender systems have been introduced in recent years to help people retrieve potentially useful information or products. Meanwhile, recommender systems are a hot research topic in the field of Big Data and an important transitional technology from the web age to the information age. Because of their great value in both theory and application, recommender systems have attracted much attention from academia and industry.

The main goal of a recommender system is to find recommendations that match users’ interests. Hence, Top-N recommendation is the core task. Collaborative filtering (CF) is a leading approach to building recommender systems and has seen considerable development and popularity. However, several drawbacks still hinder its further development, including data sparsity, cold start, popularity bias, scalability, temporal effects, accuracy, and diversity. This dissertation focuses on addressing the data sparsity and popularity bias of CF in order to improve performance on the Top-N recommendation task. The main research work and contributions are as follows:

1) Existing recommender systems based on CF suffer from the popularity bias problem: popular items are always recommended to users regardless of whether they match the users’ preferences. To address this, this dissertation proposes an opinion-based CF approach (OWUserCF). After analyzing the causes and effects of popularity bias and comparing existing solutions, it puts forward the assumption that a user’s rating of an item says something different about his or her preference depending on the popularity of that item. Based on this assumption, OWUserCF introduces weighting functions that adjust the influence of popular items according to item popularity and user opinions. Experimental results show that OWUserCF outperforms the baseline approaches on the Top-N recommendation task, which indicates that OWUserCF alleviates the popularity bias problem to a certain degree.

2) Data sparsity is a main challenge in CF: only a few ratings are observed, and the many items that have not been rated constitute missing data. This dissertation investigates why these data are missing and finds that they are missing not at random. Part of the missing data arises because users choose not to rate the items, and this part represents negative examples of user preferences. Utilizing this information is expected to improve the performance of recommendation algorithms. Unfortunately, negative examples are mixed with unlabeled positive examples in the missing data, and it is hard to distinguish them. This dissertation proposes three schemes for modeling the negative examples in missing data: a weighting scheme, a random sampling scheme, and a neighbor-based sampling scheme. The schemes are then combined with SVD++, a state-of-the-art matrix factorization approach, to generate recommendations. Experimental results show that the proposed approaches achieve better Top-N performance than the baselines in both accuracy and diversity.

3) Conventional CF approaches rest on the implicit assumption that users randomly select the items they rate. However, users are free to choose which items to rate. In the author’s view, users rate the items they want to rate, especially in the age of information overload. Accordingly, a Two-layer Model of User Behavior (TMUB) is proposed, which divides user behavior into two layers: in the first, the user selects an item to rate; in the second, the user rates it with a value. This dissertation analyzes the difference between TMUB and the conventional model and verifies the effectiveness of TMUB through data analysis on a real-world recommender system dataset. A two-step recommendation framework is then proposed as a simulation of TMUB: the first step predicts the probability that user u rates item i, and the second step predicts the value with which u would rate i. Furthermore, two-step neighbor-based recommendation algorithms and a two-step model-based recommendation algorithm are proposed based on the framework. Experimental results show that these two-step algorithms outperform benchmark approaches in both accuracy and diversity, which demonstrates the effectiveness of TMUB and of the two-step recommendation algorithms.
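
The two-step framework described in contribution 3 can be illustrated with a minimal neighbor-based sketch. This is not the dissertation's algorithm: the abstract does not specify the similarity measure or how the two steps are combined for ranking, so the cosine similarity on "rated / not rated" vectors, the product of the two predictions as the ranking score, and the names `cosine_sim` and `two_step_top_n` are all assumptions made here for illustration.

```python
import numpy as np

def cosine_sim(M):
    """Row-wise cosine similarity with the diagonal zeroed (no self-neighbors)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = M / norms
    S = unit @ unit.T
    np.fill_diagonal(S, 0.0)
    return S

def two_step_top_n(R, n=1):
    """R is a users x items rating matrix where 0 marks a missing rating.

    Step 1 estimates how likely user u is to rate item i.
    Step 2 estimates the value u would give i.
    Ranking by the product of the two is an assumption of this sketch.
    """
    B = (R > 0).astype(float)        # selection layer: did u rate i?
    S = cosine_sim(B)                # user-user similarity on selections

    # Step 1: similarity-weighted share of neighbors who rated item i (in [0, 1]).
    total = S.sum(axis=1, keepdims=True)
    total[total == 0] = 1.0
    p_rate = (S @ B) / total

    # Step 2: similarity-weighted mean of the ratings neighbors gave item i.
    w = S @ B
    r_hat = (S @ R) / np.where(w == 0, 1.0, w)

    score = p_rate * r_hat
    score[R > 0] = -np.inf           # never recommend items already rated
    top = np.argsort(-score, axis=1)[:, :n]
    return top, p_rate, r_hat

# Toy data, invented for this example only.
R_demo = np.array([
    [5, 4, 0, 0],
    [5, 0, 4, 0],
    [0, 4, 5, 1],
    [1, 0, 0, 5],
], dtype=float)
top, p_rate, r_hat = two_step_top_n(R_demo, n=1)
```

Separating the two estimates lets an item that a user is unlikely to select rank low even when its predicted rating is high, which is the intuition the abstract attributes to TMUB.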
Keywords/Search Tags: Top-N recommendation, collaborative filtering, popularity bias, data sparsity, modeling missing data, two-layer model of user behavior, two-step recommendation algorithm