
Research on K-Means Clustering and User's Interest Change Based on Kruskal Algorithm

Posted on: 2016-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Wang
Full Text: PDF
GTID: 2208330470970756
Subject: Computer application technology
Abstract/Summary:
The rapid development of information technology and the Internet has driven rapid growth in information resources. The result is serious information overload: users waste a great deal of time searching massive amounts of information for what they need. Recommendation systems, which provide personalized recommendations according to each user's needs, effectively alleviate this overload. However, recommendation quality is strongly affected by several widespread problems, including data sparsity, cold start, and poor real-time performance. How to deal with these problems effectively has become a hotspot in current recommendation-system research, and it is the research direction of this thesis.

First, within the setting of recommendation systems, we study personalized recommendation algorithms and use related techniques to analyze and compare the data. To address the sparsity of the user-item rating matrix, this thesis adopts an improved filling algorithm that pre-fills the original rating matrix with predicted item scores, which addresses the data sparsity problem. To improve the real-time performance of the recommendation system, this thesis uses clustering techniques: it introduces the traditional K-means clustering algorithm in detail and analyzes its advantages and disadvantages. Because traditional K-means is sensitive to the choice of initial cluster centers, this thesis proposes using the Kruskal algorithm to construct a minimum spanning tree (MST) and thereby automatically generate uniformly distributed initial cluster centers, which resolves this weakness of the traditional K-means algorithm.
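The abstract does not spell out how the MST is turned into initial centers, so the following is only a minimal sketch under a common assumption: build the MST with Kruskal's algorithm over the complete Euclidean graph, cut the k-1 longest MST edges, and use the mean of each resulting component as a starting center. The function names `kruskal_mst` and `mst_initial_centers` are hypothetical and are not taken from the thesis.

```python
from itertools import combinations
from math import dist


def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def kruskal_mst(points):
    """Return MST edges (weight, i, j) in ascending weight order."""
    n = len(points)
    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    parent = list(range(n))
    mst = []
    for w, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # edge joins two different trees, so it is safe to add
            parent[ri] = rj
            mst.append((w, i, j))
    return mst


def mst_initial_centers(points, k):
    """Assumed seeding rule: drop the k-1 longest MST edges, average each part."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    mst = kruskal_mst(points)
    kept = mst[:len(mst) - (k - 1)]  # MST edges are already sorted by weight
    parent = list(range(len(points)))
    for _, i, j in kept:
        parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for idx, p in enumerate(points):
        groups.setdefault(_find(parent, idx), []).append(p)
    # One center per connected component: the coordinate-wise mean.
    return [tuple(sum(axis) / len(g) for axis in zip(*g))
            for g in groups.values()]
```

The returned centers could then be passed to any K-means implementation as its starting points in place of random initialization. Sorting all pairwise edges is quadratic in the number of points, which is acceptable for an offline step on modest data but would need a sparser graph at scale.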
Finally, we combine the Kruskal algorithm with the K-means algorithm to cluster the filled rating matrix offline, which effectively improves the real-time performance of the recommendation system.

Second, a user's interests may change over time. This thesis assumes that a user's recent item ratings better reflect the user's current interests, and that the closer users' rating times are to each other, the greater their similarity. In user-based or item-based collaborative filtering, similarity is therefore computed with time taken into account, and a utility function assigns different utility values to different ratings. While improving the accuracy of similar-neighbor selection, the algorithm also addresses the item cold-start and user cold-start problems, which ultimately improves recommendation accuracy.

Finally, to verify the validity of the proposed algorithm, comparative experiments were carried out between it and the traditional collaborative filtering algorithm. The experimental results show that the recommendation quality of the proposed algorithm is clearly superior to that of the traditional algorithm.
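The abstract does not give the form of the utility function, so the sketch below only illustrates the general idea of time-aware similarity. It assumes an exponential decay on rating age and a cosine similarity over co-rated items; the decay form, the `decay` parameter, and the function names are all assumptions made for illustration, not the thesis's definitions.

```python
from math import exp, sqrt


def time_weight(t_rated, t_now, decay=0.01):
    """Hypothetical utility: 1.0 for a rating made now, smaller for older ones."""
    return exp(-decay * (t_now - t_rated))


def time_weighted_similarity(ratings_u, ratings_v, t_now, decay=0.01):
    """Cosine similarity over co-rated items, each score scaled by its time weight.

    ratings_u and ratings_v map item -> (score, timestamp), with timestamps
    in the same unit as t_now (for example, days).
    """
    common = ratings_u.keys() & ratings_v.keys()
    if not common:
        return 0.0
    dot = norm_u = norm_v = 0.0
    for item in common:
        score_u, t_u = ratings_u[item]
        score_v, t_v = ratings_v[item]
        a = score_u * time_weight(t_u, t_now, decay)
        b = score_v * time_weight(t_v, t_now, decay)
        dot += a * b
        norm_u += a * a
        norm_v += b * b
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / sqrt(norm_u * norm_v)
```

With `decay=0` this reduces to ordinary cosine similarity on co-rated items; a larger `decay` lets recent agreement between two users dominate old disagreement.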
Keywords/Search Tags: recommendation system, K-means clustering, Kruskal algorithm, user interest change