
Improving Collaborative Filtering Based On Information Entropy And Earth Mover's Distance

Posted on: 2019-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zhou
Full Text: PDF
GTID: 2428330566486573
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, recommender systems have become a preferred approach to the information overload problem. Collaborative filtering, in which the similarity metric is the core component, is the most widely applied technique in recommender systems. Although researchers have explored many metrics for measuring similarity between users or items, recommendation accuracy still suffers from the deficiencies of those metrics. Traditional similarity metrics face the following challenges: (i) the absolute offset of a user's ratings alone is insufficient to characterize the user's taste preferences; (ii) the small number of co-rated items caused by data sparsity makes similarity metrics inaccurate, since they compute similarity mainly from the ratings of co-rated items; (iii) most metrics fail to efficiently uncover the information buried in user ratings that can characterize a user's taste preferences. These challenges directly affect recommendation accuracy.

This thesis proposes several improvements to similarity metrics that aim to alleviate these problems. The major contributions are: (i) an improved metric based on the information entropy and skewness of the user rating distribution, which are used to describe user interest preferences because they are tied to user rating behavior; (ii) an improved metric based on the Earth Mover's Distance (EMD), which measures the distance between item rating distributions. Item similarity is obtained through a nonlinear mapping of the EMD between two item rating distributions, so that user similarity also benefits from ratings on non-co-rated items; (iii) an enhanced metric that models the item rating distribution with an asymmetric Laplace distribution (ALD). The buried information, called surprisal, is extracted with the ALD, and user similarity is calculated from the users' surprisal vectors; (iv) a new hybrid metric that merges the similarities produced by multiple metrics. A high similarity close to one is amplified through a nonlinear function, and the resulting value is treated as a vote, cast by the corresponding metric, on the similarity between the active user and a candidate neighbor. All votes are accumulated and converted into a new similarity.

The proposed metrics were evaluated experimentally on MovieLens datasets of different sizes and compared in terms of Mean Absolute Error (MAE). The experimental results show that the proposed metrics are more accurate than traditional and related metrics.
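
The quantities named in contributions (i) and (ii) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes a 5-star rating scale, uses Shannon entropy and population skewness as the statistics, and computes the one-dimensional EMD between two rating histograms as the sum of absolute cumulative differences. All function names are hypothetical, and `emd_to_similarity` uses `exp(-d)` only as a stand-in, since the abstract does not specify the nonlinear mapping.

```python
import math
from collections import Counter

RATING_SCALE = (1, 2, 3, 4, 5)  # assumption: 5-star scale


def rating_distribution(ratings, scale=RATING_SCALE):
    """Relative frequency of each rating value (ratings must be non-empty)."""
    counts = Counter(ratings)
    total = len(ratings)
    return [counts[r] / total for r in scale]


def entropy(dist):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def skewness(ratings):
    """Population skewness; returns 0.0 when all ratings are identical."""
    n = len(ratings)
    mean = sum(ratings) / n
    m2 = sum((r - mean) ** 2 for r in ratings) / n
    if m2 == 0:
        return 0.0
    m3 = sum((r - mean) ** 3 for r in ratings) / n
    return m3 / m2 ** 1.5


def emd_1d(p, q):
    """EMD between two histograms defined over the same unit-spaced bins."""
    distance, carried = 0.0, 0.0
    for pi, qi in zip(p, q):
        carried += pi - qi        # mass that must still move to later bins
        distance += abs(carried)  # cost of carrying it one bin further
    return distance


def emd_to_similarity(distance):
    """Hypothetical nonlinear mapping from a distance to a value in (0, 1]."""
    return math.exp(-distance)
```

For example, `emd_1d(rating_distribution([1, 1]), rating_distribution([5, 5]))` moves all of the mass across four bins, while two identical histograms have distance zero and therefore similarity one under the stand-in mapping.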
Keywords/Search Tags:Collaborative Filtering, Similarity Metric, Information Entropy, Skewness, Earth Mover's Distance, Asymmetric Laplace Distribution, Surprisal