Research On Personalized Recommendation Algorithms Based On Clustering And Collaborative Filtering

Posted on: 2013-02-17; Degree: Master; Type: Thesis
Country: China; Candidate: G Li
GTID: 2218330374465576; Subject: Computer application technology
Abstract/Summary:
With the rapid development of computer network technology, the growing popularity of e-commerce services, and the continuous increase in online users and information resources, users confronted with a vast choice of information are often overwhelmed by it. It is difficult for people to quickly and accurately locate the resources they need, and the problem of information overload grows ever more serious. Customers urgently expect network information systems to offer an active recommendation function that provides personalized service and helps them make decisions. Personalized recommender systems emerged in response and have become one of the main methods for resolving these problems.

Personalized recommendation algorithms are the core of a recommender system and largely determine its performance. Among them, collaborative filtering is currently one of the most popular personalized recommendation technologies, and it is the most widely and successfully adopted approach on e-business websites. It predicts the target user's ratings for unrated items from the ratings given to those items by other users whose interests and preferences are similar to the target user's, and recommends items accordingly. As a result, the degree of automation, persistence, and personalization of recommenders has improved markedly.

However, traditional collaborative filtering has many deficiencies in practical application. Existing algorithms are limited to the user-item rating matrix, which suffers from sparsity and from the cold-start problem for new users and new items. The calculation of neighbor similarity is inaccurate: it considers only the items that users have rated in common and ignores the correlation and similarity of objective content such as
user characteristics and item attributes. In addition, scalability and real-time performance are poor, because the nearest neighbors are sought online across the entire rating matrix. Moreover, existing algorithms give equal consideration to users' interests from different periods, so recommendations lose their timeliness. As a result, prediction accuracy is low and recommendation quality is seriously degraded.

To address these problems, this thesis first proposed a nonlinear combinatorial collaborative filtering algorithm based on user characteristics, item attributes, and time weight. Firstly, to obtain a more accurate nearest-neighbor set, it improved the neighbor similarity calculation by constructing a user characteristic matrix and an item attribute matrix and using the correlation among user characteristics and among item attributes respectively, so that uncorrelated neighbors do not distort the similarity statistics. Secondly, the initial item prediction ratings are filled into the rating matrix to make it denser. Thirdly, it added a time weight to the final user prediction rating, which highlights the differences between a user's interests in different periods and gives the user's latest interests and preferences the greatest weight.

Building on these improvements, the thesis then designed a personalized recommendation algorithm that integrates clustering with collaborative filtering, in order to avoid computing neighbor similarity over the entire user-item rating matrix, reduce the dimensionality of the nearest-neighbor search space, enhance scalability, and speed up online real-time response. First, it used the Kruskal minimum spanning tree algorithm to optimize the K-Means partitioning method and put forward an Anti-Kruskal-based K-Means clustering algorithm, which preprocesses the data offline to construct a minimum cost forest and generate K clusters and initial
cluster centers automatically. This overcomes a defect of K-Means: the K value and the initial cluster means must be chosen manually and at random, and different K values and initial cluster centers lead to inconsistent cluster partitions, which eventually results in inaccurate nearest neighbors. In addition, taking the user characteristic and item attribute matrices into account and combining them with the Anti-Kruskal-based K-Means algorithm, it improved the neighbor clustering method based on similarity among item attributes and among user characteristics respectively. Furthermore, the initial item cluster prediction ratings are filled into the rating matrix, and for new users and new items, characteristic similarity and attribute similarity replace rating similarity. Lastly, it introduced a time function into the final user cluster prediction rating, which assigns different weights to users' actual ratings of items and scales the original ratings to reflect the timeliness of users' latest interests.

Both improved algorithms were implemented in C++, and the thesis used the MovieLens data set for predictive accuracy experiments and analysis. Experiments on the distributions of three similarity measures and their mean absolute errors (MAE) show that the distribution of Pearson correlation similarity is the most reasonable and that its MAE is lower than that of cosine similarity. The experiment on nearest-neighbor search efficiency shows that the second improved algorithm can find more nearest neighbors in a smaller space. Comparing the MAE, Precision, and Recall of the improved algorithms with those of the traditional algorithm demonstrates that the designed algorithms significantly improve prediction accuracy and recommendation performance by reducing the sparsity and cold-start problems and enhancing scalability and real-time performance. Finally, the thesis analyzed their online computation speed: compared with the traditional collaborative filtering algorithm, the time complexity of the first improved algorithm remains essentially unchanged, while the second improved algorithm is clearly superior to the traditional one.
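
As a rough illustration of the ingredients named in the abstract (Pearson correlation similarity over co-rated items, a time weight on ratings, and MAE), the following Python sketch may help. It is not the thesis's C++ implementation. The abstract does not give the form of the time function, so the exponential half-life decay in `time_weight`, the neighbor count `k`, and all function and parameter names here are assumptions made purely for illustration.

```python
import math


def pearson(ratings_u, ratings_v):
    """Pearson correlation computed over the items both users have rated."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)


def time_weight(age_days, half_life_days=30.0):
    """Hypothetical time weight: halves every half_life_days (an assumption)."""
    return 0.5 ** (age_days / half_life_days)


def predict(target, item, ratings, ages, k=2):
    """Predict target's rating of item from the k most similar neighbors.

    ratings: {user: {item: rating}}
    ages:    {user: {item: age of that rating in days}}
    Each neighbor's contribution is scaled by similarity * time weight,
    so recent ratings count more than old ones.
    """
    mean_t = sum(ratings[target].values()) / len(ratings[target])
    neighbors = []
    for user, user_ratings in ratings.items():
        if user == target or item not in user_ratings:
            continue
        sim = pearson(ratings[target], user_ratings)
        if sim > 0:  # keep only positively correlated neighbors
            neighbors.append((sim, user))
    neighbors.sort(reverse=True)

    num = den = 0.0
    for sim, user in neighbors[:k]:
        user_ratings = ratings[user]
        mean_v = sum(user_ratings.values()) / len(user_ratings)
        w = sim * time_weight(ages[user][item])
        num += w * (user_ratings[item] - mean_v)
        den += w
    # Fall back to the target's mean rating when no neighbor qualifies.
    return mean_t if den == 0 else mean_t + num / den


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

In this sketch the time weight scales each neighbor's rating deviation, which is only one plausible reading of "time weight"; the thesis may apply its time function differently, for example to the target user's own rating history.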
Keywords/Search Tags: personalized recommendation, collaborative filtering, clustering, user characteristic, item attribute, time weight