Improved Collaborative Filtering Algorithm Based On LDA Model And Kernel Method

Posted on: 2018-04-03 | Degree: Master | Type: Thesis
Country: China | Candidate: J F Song
GTID: 2417330569985094 | Subject: Applied Statistics
Abstract/Summary:
With the rapid development of the internet, information overload makes it difficult for people to find what they actually need among abundant information, and this has become one of the foremost challenges of the era. Current information systems hold rich historical data on user behavior. By analyzing this data, a recommender system can uncover users' potential interests, make accurate recommendations, and so filter information effectively, which gives such systems great practical value. At the same time, recommender systems face many problems. This thesis addresses two of them, data noise and data sparsity, and proposes solutions for each.

For data noise, this thesis applies the LOF (Local Outlier Factor) outlier detection algorithm. Users' outlier factors are computed in two ways: from users' ratings of commonly rated items, and from constructed indices that characterize the distribution of each user's ratings. Comparing the two shows that the constructed indices perform better than the original ratings of commonly rated items.

For data sparsity, this thesis builds two hybrid models, LDA-CF and Kernel-CF. 1) LDA-CF follows the idea of a topic generation model: it assumes that users like an item because they like certain latent topics. It uses users' rating data to generate pseudo-documents, computes each user's distribution over latent topics and the distribution of items under each latent topic, and then combines the similarity between users' topic distributions and items' topic distributions with the neighborhood method to predict user preferences. 2) Kernel-CF assumes that each user's ratings follow a stable distribution. It uses kernel density estimation to estimate each user's rating density function separately, computes user similarity from these density functions, and then combines it with the neighborhood method to predict user preferences.

Experiments on the MovieLens dataset show that both hybrid collaborative filtering models outperform user-based and item-based collaborative filtering in terms of RMSE. Finally, this thesis introduces a new application of recommendation algorithms on a classroom interaction platform: using Kernel-CF to predict students' scores when answering questions.
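The Kernel-CF idea described in the abstract (estimate each user's rating density, compare densities, then predict from neighbors) might be sketched as follows. This is a minimal illustration, not the thesis's implementation: the Gaussian kernel, the bandwidth of 0.5, the evaluation grid, the Bhattacharyya coefficient as the density similarity, and the mean-centered weighted average are all assumptions, and the function names are hypothetical. It also assumes every user has at least one rating.

```python
import numpy as np

def rating_density(ratings, grid, bandwidth=0.5):
    """Gaussian kernel density estimate of one user's ratings on `grid`."""
    ratings = np.asarray(ratings, dtype=float)
    # One Gaussian bump per observed rating, averaged over all ratings.
    z = (grid[:, None] - ratings[None, :]) / bandwidth
    norm = len(ratings) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm

def density_similarity(p, q, grid):
    """Bhattacharyya coefficient of two densities (assumed similarity measure)."""
    step = grid[1] - grid[0]
    # Renormalize because the grid truncates the tails of the estimate.
    p = p / (p.sum() * step)
    q = q / (q.sum() * step)
    return float(np.sqrt(p * q).sum() * step)

def predict(R, user, item, k=2, bandwidth=0.5):
    """Predict R[user, item]; R is users x items with np.nan for missing."""
    grid = np.linspace(0.0, 6.0, 121)  # assumed to cover a 1-5 rating scale
    dens = [rating_density(row[~np.isnan(row)], grid, bandwidth) for row in R]
    user_mean = float(np.nanmean(R[user]))
    # Only users who actually rated the item can serve as neighbors.
    cands = [v for v in range(R.shape[0])
             if v != user and not np.isnan(R[v, item])]
    if not cands:
        return user_mean
    sims = np.array([density_similarity(dens[user], dens[v], grid)
                     for v in cands])
    top = np.argsort(sims)[::-1][:k]  # indices of the k most similar users
    weights = sims[top]
    if weights.sum() == 0.0:
        return user_mean
    # Neighbors' deviations from their own mean rating.
    devs = np.array([R[cands[i], item] - np.nanmean(R[cands[i]]) for i in top])
    return float(user_mean + (weights * devs).sum() / weights.sum())

if __name__ == "__main__":
    nan = np.nan
    R = np.array([[5.0, 4.0, nan, 1.0],
                  [4.0, 5.0, 2.0, 1.0],
                  [1.0, 2.0, 5.0, 4.0],
                  [5.0, 5.0, 1.0, nan]])
    print(predict(R, user=0, item=2))
```

Note that a similarity built only on rating densities ignores which items were rated: two users with the same set of rating values get similarity 1 even if they rated different items, which is presumably why the approach is paired with a neighborhood method.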
Keywords/Search Tags: Recommender System, Collaborative Filtering, LDA Algorithm, LOF Algorithm, Kernel Methods