Research On Adversarial Pairwise Learning For Recommendation Algorithm

Posted on: 2020-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Sun
Full Text: PDF
GTID: 2428330575971460
Subject: Software engineering
Abstract/Summary:
The explosive growth of data has made information overload more serious. Recommender systems are an effective way to filter information and alleviate this overload, and they are widely used on many websites. The goal of a recommender system is to provide each user with a list of items they are likely to prefer by modeling their historical feedback. Modeling user preference from implicit feedback, which contains only a user's historical behaviors, is challenging. Traditional pointwise and pairwise ranking methods divide items into observed and unobserved sets and capture user preference based on that division. However, the observed set includes items that users dislike, and the unobserved set contains items that users would like. The performance of the learned models is therefore limited by this noisy implicit feedback.

The Generative Adversarial Network (GAN) offers another way to model implicit feedback, namely adversarial learning. However, the standard GAN is designed for differentiable values, and its objective function causes unstable training and slow convergence. Although policy gradient is a common method for training a non-differentiable generative model, the high variance of the estimated gradient makes adversarial training even more unstable.

To address these issues, this thesis proposes a novel Adversarial Pairwise Learning (APL) recommendation method. APL contains two components, a generator and a discriminator. The generator aims to capture user preference and generate items the user would like. The discriminator tries to distinguish the user's preference between generated items and observed items, and then guides the generator's training. Combining the advantages of adversarial learning and pairwise ranking, APL adopts a pairwise loss as its objective function to stabilize training and speed up convergence. To stabilize training further, discrete item sampling is replaced with a differentiable process, so both the discriminator and the generator can be trained by standard gradient descent methods. In APL, the generator's update signal is derived directly from the discriminator rather than from the implicit feedback. The generator can thus focus on modeling user preference without being limited by the division of items into observed and unobserved sets.

Extensive experiments on four real-world datasets (Gowalla, Yelp, Yahoo and Pinterest) show that (1) compared with IRGAN, which trains its adversarial model with policy gradient and a pointwise loss, APL stabilizes the training process and speeds up model convergence; and (2) compared with other methods, APL achieves better recommendation performance in various recommendation scenarios.
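
The mechanism described in the abstract, a pairwise loss shared by generator and discriminator plus a differentiable replacement for discrete item sampling, can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. It assumes matrix-factorization scoring (suggested only by the keywords), a BPR-style loss of the form -log(sigmoid(difference)), and a Gumbel-softmax relaxation as the differentiable sampler; the abstract does not name the relaxation it uses. All variable names, sizes, and the temperature `tau` are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def gumbel_softmax(logits, tau, rng):
    # Relaxed one-hot "sample" over items. This relaxation is an assumption;
    # the abstract only says sampling is made differentiable.
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    return softmax((logits + gumbel_noise) / tau)

def pairwise_loss(preferred_score, other_score):
    # BPR-style loss -log(sigmoid(preferred - other)), computed stably.
    return np.logaddexp(0.0, -(preferred_score - other_score))

rng = np.random.default_rng(0)
n_items, dim = 6, 4

# Hypothetical matrix-factorization parameters for a single user.
g_user, g_items = rng.normal(size=dim), rng.normal(size=(n_items, dim))
d_user, d_items = rng.normal(size=dim), rng.normal(size=(n_items, dim))

observed_item = 2  # an item the user interacted with (made up)

# Generator: soft item selection instead of a discrete draw.
weights = gumbel_softmax(g_items @ g_user, tau=0.5, rng=rng)

# Discriminator scores for the observed item and the soft generated item.
d_obs = d_user @ d_items[observed_item]
d_gen = d_user @ (weights @ d_items)

# Discriminator ranks the observed item above the generated one;
# the generator is trained toward the opposite ranking (assumed symmetric form).
d_loss = pairwise_loss(d_obs, d_gen)
g_loss = pairwise_loss(d_gen, d_obs)
print(f"d_loss={d_loss:.4f}  g_loss={g_loss:.4f}")
```

Because `weights` is a smooth function of the generator scores, `g_loss` depends on the generator parameters through ordinary arithmetic, which is what lets both models be trained by gradient descent in an autodiff framework. A policy-gradient approach would instead draw a discrete item index and estimate the gradient from sampled rewards.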
Keywords/Search Tags: recommender system, adversarial learning, pairwise ranking, matrix factorization, implicit feedback