Research And Application Of Recommendation Algorithm Based On Dimension Reduction And Clustering

Posted on: 2021-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: X Chen
GTID: 2428330614466009
Subject: Computer technology
Abstract/Summary:
With the spread of the Internet, the number of network users and the volume of data carrying all kinds of information are growing rapidly. Faced with an unprecedented amount of data, two questions have become urgent: how users can find the information they are interested in, and how Internet service providers can make their services stand out. Personalized recommendation systems have emerged in response. Such a system uses a recommendation algorithm to analyze a user's browsing, purchase, and rating history, and then recommends items the user is likely to be interested in. The recommendation algorithm is the core of the recommendation system.

This thesis studies and improves the traditional collaborative filtering recommendation algorithm and designs a new collaborative filtering algorithm, PK-CF, based on PCA dimensionality reduction and improved K-means clustering. To address the similarity calculation errors caused by an extremely sparse user-item rating matrix, the algorithm applies principal component analysis to reduce the dimensionality of the matrix, retaining only the dimensions that best represent user characteristics. To address the long similarity computation time of collaborative filtering in large systems, K-means clustering is performed on the low-dimensional vector space obtained after dimensionality reduction, which narrows the search range for the target user's nearest neighbors. A new initial centroid selection algorithm based on the k-dimensional (k-d) tree is designed to improve K-means; it preserves the final clustering quality while speeding up clustering. To further improve the real-time performance and scalability of the recommendation system, a parallel scheme for the PK-CF algorithm is designed on Spark, a mainstream big data platform, and both the improved K-means algorithm and the rating prediction process are parallelized.

The traditional collaborative filtering algorithm, the K-means clustering based collaborative filtering algorithm, and the PK-CF algorithm are evaluated on the MovieLens dataset. The results show that PK-CF effectively improves the accuracy and recall of the recommendation results and has higher time efficiency. Finally, the thesis applies PK-CF to a music recommendation scenario and develops a music recommendation system to further test the practicality of the algorithm.
Keywords/Search Tags:Personalized recommendation system, Collaborative filtering, Principal component analysis, K-means clustering
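
The pipeline summarized in the abstract (PCA reduction, then K-means clustering, then a nearest-neighbor search restricted to the target user's cluster) can be sketched roughly as follows. This is an illustrative NumPy sketch under stated assumptions, not the thesis's PK-CF implementation: all function names are hypothetical, a rating of 0 is assumed to mean "unrated", a simple farthest-first initialization stands in for the thesis's k-d tree based centroid selection, cosine similarity in the reduced space is an assumed similarity measure, and the Spark parallelization is not shown. The ratings are made-up toy data.

```python
import numpy as np


def pca_reduce(ratings, n_components):
    """Project each user (row) onto the top principal components."""
    centered = ratings - ratings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def farthest_first_init(points, k):
    """Stand-in initializer (assumption; NOT the thesis's k-d tree method)."""
    chosen = [0]
    min_dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(1, k):
        nxt = int(min_dist.argmax())  # point farthest from all chosen centroids
        chosen.append(nxt)
        dist_to_nxt = np.linalg.norm(points - points[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_nxt)
    return points[chosen].copy()


def kmeans(points, k, n_iter=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centroids = farthest_first_init(points, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels


def predict_rating(ratings, reduced, labels, user, item, n_neighbors=2):
    """Similarity-weighted average over neighbors in the user's own cluster."""
    # Candidates: same cluster, have rated the item (0 = unrated), not the user.
    mask = (labels == labels[user]) & (ratings[:, item] > 0)
    mask[user] = False
    candidates = np.flatnonzero(mask)
    if candidates.size == 0:
        return None
    u, c = reduced[user], reduced[candidates]
    # Cosine similarity in the reduced space (assumed similarity measure).
    sims = c @ u / (np.linalg.norm(c, axis=1) * np.linalg.norm(u) + 1e-12)
    top = np.argsort(sims)[::-1][:n_neighbors]
    weights = np.clip(sims[top], 0.0, None)
    neighbor_ratings = ratings[candidates[top], item]
    if weights.sum() == 0:
        return float(neighbor_ratings.mean())
    return float(weights @ neighbor_ratings / weights.sum())


# Toy user-item rating matrix: 6 users x 5 items, 0 = unrated.
ratings = np.array([
    [5, 5, 5, 1, 1],
    [5, 5, 4, 1, 1],
    [5, 5, 0, 1, 1],  # user 2 has not rated item 2
    [1, 1, 2, 5, 5],
    [1, 1, 1, 5, 5],
    [1, 1, 0, 5, 5],
], dtype=float)

reduced = pca_reduce(ratings, n_components=2)
labels = kmeans(reduced, k=2)
pred = predict_rating(ratings, reduced, labels, user=2, item=2)
print("cluster labels:", labels)
print("predicted rating for user 2, item 2:", pred)
```

Restricting the neighbor search to one cluster means similarities are computed only against that cluster's members rather than against every user, which is the time saving the abstract attributes to the clustering step.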