Research On Data Sparsity And Cold-Start Problem In Collaborative Filtering Recommender System

Posted on: 2019-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: X J Li
Full Text: PDF
GTID: 2428330566960646
Subject: Computer Science and Technology
Abstract/Summary:
A recommender system automatically mines users' interests from their historical behavior and then provides them with personalized recommendations. As one of the most successful algorithms for personalized recommendation, collaborative filtering has attracted considerable research attention and has been widely used. Because collaborative filtering generates recommendations from rating data alone, it suffers from serious data sparsity and cold-start problems. Item content, however, is generally available and describes the characteristics of the items themselves. This thesis therefore treats item content as a useful supplement to rating data and integrates it into collaborative filtering to mitigate the data sparsity and cold-start problems. The main work and contributions of this thesis are as follows:

1. To alleviate the data sparsity problem, this thesis takes both item content and the rating matrix into account and proposes CRAEmf, a probabilistic matrix factorization model based on the Competitive Recurrent Autoencoder (CRAE). CRAEmf automatically extracts a content feature representation for each item from its content and then associates these content features with the latent features of matrix factorization. CRAEmf captures the semantic and contextual information of item content. In addition, a competition mechanism is introduced so that the model extracts the key features from the item content and discards redundant information, which further improves rating prediction accuracy. Experiments on three real-world datasets show that the model outperforms other recommendation models that also integrate item content, even when the rating data is extremely sparse.

2. To address the item cold-start problem, this thesis proposes a mapping model from the content features of items to their latent features. Matrix factorization represents each item with a latent feature vector, but for a new item without rating data this vector cannot be obtained accurately through matrix factorization. Therefore, CRAE is first used to generate item content features automatically from item content, and three methods are then proposed for mapping the content feature vector of an item to its latent feature vector. In this way, the latent vector of an item without rating data can be obtained from its content information and added to the matrix factorization model, which solves the item cold-start problem. In addition, once a new item receives ratings from users, a proposed update algorithm dynamically updates the item's latent feature vector based on those ratings, which achieves a smooth transition from new item to old item and further improves recommendation accuracy. Finally, the effectiveness of the algorithm in handling the item cold-start problem is verified on three real datasets.
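
The cold-start procedure described in the abstract (map an item's content features to a latent vector, then refine that vector as ratings arrive) can be illustrated with a minimal sketch. The abstract does not specify the three mapping methods or the update algorithm, so everything here is an assumption made for illustration: the mapping is a plain ridge-regression linear map, the update is a single SGD step on a regularized squared error, the content features are random stand-ins for CRAE output, and the names `fit_linear_mapping`, `map_new_item`, and `update_item_vector` are hypothetical.

```python
import numpy as np


def fit_linear_mapping(C, V, reg=0.1):
    """Fit W so that C @ W approximates V (ridge regression).

    C: (n_items, content_dim) content features of items that already have ratings.
    V: (n_items, latent_dim) latent vectors learned by matrix factorization.
    Assumption: a linear map; the thesis's actual mapping methods are not given.
    """
    d = C.shape[1]
    return np.linalg.solve(C.T @ C + reg * np.eye(d), C.T @ V)


def map_new_item(c_new, W):
    """Estimate the latent vector of an unrated item from its content features."""
    return c_new @ W


def update_item_vector(v, u, rating, lr=0.05, reg=0.02):
    """One SGD step on (rating - u.v)^2 + reg * ||v||^2, changing only v.

    Assumption: a generic matrix-factorization update, not the thesis's algorithm.
    """
    err = rating - u @ v
    return v + lr * (err * u - reg * v)


# Synthetic stand-ins for CRAE content features and learned latent vectors.
rng = np.random.default_rng(0)
n_items, content_dim, latent_dim = 200, 16, 8
C = rng.normal(size=(n_items, content_dim))
V = C @ rng.normal(size=(content_dim, latent_dim))

W = fit_linear_mapping(C, V)

# A new item has content features but no ratings yet.
c_new = rng.normal(size=content_dim)
v_new = map_new_item(c_new, W)

# When the first rating arrives, nudge the mapped vector toward it.
u = rng.normal(size=latent_dim)  # latent vector of the rating user
v_new = update_item_vector(v_new, u, rating=4.0)
```

Starting from a content-derived vector and then applying ordinary rating-driven updates is one simple way to get the "smooth transition from new item to old item" that the abstract mentions; a non-linear mapping could be substituted for `fit_linear_mapping` without changing the rest of the sketch.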
Keywords/Search Tags: Collaborative Filtering, Data Sparsity Problem, Cold-start Problem, Autoencoder, Probabilistic Matrix Factorization