Collaborative Filtering Recommendation Algorithm And Implementation Based On Sparse Data

Posted on: 2020-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: L L Guo
Full Text: PDF
GTID: 2428330590971744
Subject: Computer technology
Abstract/Summary:
Collaborative filtering predicts what individual users will need in the future from their previous behavioral data. With the spread of computing technology, the Internet now generates data at a scale that makes it hard for people to choose among the available options. Research on collaborative filtering recommendation therefore has both theoretical significance and practical value. Building on a detailed analysis of the open problems in collaborative filtering recommendation, this thesis studies the problem of sparse data. The main work is as follows.

Firstly, this thesis proposes a strategy based on user score preference to address the failure of similarity judgment when data is sparse. A trust degree for user score values is introduced on top of the Pearson correlation coefficient, and this trust degree is used as a weight when calculating the similarity between users, so that the trust relationship between users is taken into account. Next, the high-score group and the low-score group are identified, and a user rating preference model is constructed by mining the real preference information hidden behind each user's rating values. Finally, by combining the improved similarity measure with the improved score estimation method, the similarity among users can be judged more accurately, which yields more compact user categories and more accurate estimates.

Secondly, to address sparse data and the fuzziness of user interest, this thesis proposes a fuzzy clustering fusion strategy. When calculating Euclidean distance, the fuzzy c-means algorithm considers only the users' rating data, which is too one-sided. This thesis defines the comprehensive distance between users as a weighted fusion of the attribute distance between users and the Euclidean distance between their rating values. The fuzzy c-means algorithm is also very sensitive to isolated points and prone to converging to local optima. To improve it and optimize the clustering result, this thesis combines the advantages of the k-means++ and k-medoids algorithms to select cluster centers first, and then uses the resulting centers as the initial cluster centers for fuzzy c-means. To address the sparsity of the MovieLens movie rating data, this thesis uses the movie type information to convert the user-movie rating matrix into a user-movie type preference matrix, which reduces the dimensionality and makes the data no longer sparse.

Thirdly, to verify the proposed methods with practical data in a working system, a prototype collaborative filtering recommendation system is designed and implemented. It supports functions such as movie crawling, movie recommendation, popular movie playback, and movie scoring.
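The abstract does not give the formulas behind the matrix conversion or the fused distance, so the following is only a minimal sketch under stated assumptions. It assumes a user's preference for a movie type is the mean of the ratings that user gave to movies of that type, that the attribute distance is the fraction of differing user attributes, and that the fusion is a convex combination controlled by a hypothetical weight `alpha`. All function and parameter names are illustrative and are not taken from the thesis.

```python
from math import sqrt


def genre_preference_matrix(ratings, movie_genres, genres):
    """Collapse a sparse user-movie rating table into a dense user-genre table.

    ratings:      {user: {movie: rating}}
    movie_genres: {movie: set of genre names}
    genres:       ordered list of genre names (the columns of the result)

    Assumption: a user's preference for a genre is the mean rating the user
    gave to movies of that genre, or 0.0 if the user rated none of them.
    """
    prefs = {}
    for user, rated in ratings.items():
        row = []
        for genre in genres:
            scores = [r for movie, r in rated.items()
                      if genre in movie_genres.get(movie, ())]
            row.append(sum(scores) / len(scores) if scores else 0.0)
        prefs[user] = row
    return prefs


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def attribute_distance(attrs_a, attrs_b):
    """Assumed attribute distance: fraction of attributes that differ."""
    return sum(x != y for x, y in zip(attrs_a, attrs_b)) / len(attrs_a)


def combined_distance(pref_a, pref_b, attrs_a, attrs_b, alpha=0.5):
    """Weighted fusion of attribute distance and rating-based distance.

    alpha is a hypothetical weight in [0, 1]; the thesis's value is not stated.
    """
    return (alpha * attribute_distance(attrs_a, attrs_b)
            + (1 - alpha) * euclidean(pref_a, pref_b))


# Toy data, invented purely for illustration.
ratings = {"u1": {"m1": 5, "m2": 3}, "u2": {"m2": 4, "m3": 2}}
movie_genres = {"m1": {"Action"}, "m2": {"Action", "Comedy"}, "m3": {"Drama"}}
genres = ["Action", "Comedy", "Drama"]

prefs = genre_preference_matrix(ratings, movie_genres, genres)
distance = combined_distance(prefs["u1"], prefs["u2"],
                             ("F", "student"), ("M", "student"))
```

In this toy case the two users share only one rated movie, yet every genre column of the converted matrix has at least one entry, which is the sense in which the conversion reduces sparsity. In practice the two distance terms are on different scales, so normalizing them before fusion would likely be needed.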
Keywords/Search Tags: collaborative filtering, sparse data, fuzzy clustering, score preferences, recommender system