
Collaborative Filtering Recommendation Algorithm Based On Probability Matrix Factorization And Spectral Clustering

Posted on: 2022-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2518306743979389
Subject: Master of Applied Statistics
Abstract/Summary:
With the arrival of the big data era, data is growing exponentially. The challenge lies not only in the sheer volume of data but also in "data redundancy". A recommendation system can extract the information a user is interested in from massive amounts of data and use it to generate a recommendation list for that user. Recommendation systems are now widely applied in commercial settings, where they both uncover potential commercial value and better meet users' personalized needs.

Collaborative filtering is the most basic algorithm in recommendation systems, and it faces three main problems. The first is "data sparsity": the user-item rating matrix is sparse because only some users rate some items, so little data is available and recommendation accuracy suffers. The second is "cold start": when a new user or item appears in the system, no prior record of it exists, which directly leads to inaccurate recommendations. The third is "scalability": as new users and items are gradually added, the amount of data grows, and whether the existing recommendation algorithm can still make recommendations in real time becomes a major concern.

To address these problems and generate better personalized recommendation lists, the main work of this thesis is as follows.

First, to alleviate the "data sparsity" problem, this thesis uses probabilistic matrix factorization (PMF) to fill the sparse user-item matrix. With the MovieLens 100K data as the experimental dataset, four methods (probabilistic matrix factorization, global average, Slope One and non-negative matrix factorization) are used to fill the sparse matrix, and the root mean square error (RMSE) is used as the evaluation index. Probabilistic matrix factorization had the lowest RMSE at 0.9177, which shows that it is the best of these methods for filling the sparse matrix.

Second, to improve personalized recommendation, spectral clustering is performed on the filled user-item matrix. This narrows the range in which the target user's nearest neighbors are searched and gives a more accurate neighbor set. The traditional collaborative filtering algorithm is then applied within each cluster: user similarity is calculated and the predicted ratings are obtained.

Finally, to verify the effect of the proposed algorithm, the public MovieLens 100K dataset is used as the experimental dataset, the root mean square error (RMSE) and the mean absolute error (MAE) are used as evaluation indicators, and five sets of experiments are conducted:

Experiment 1: The regularization parameter λ of probabilistic matrix factorization is determined, with RMSE as the evaluation index. The RMSE is smallest when λ=0.1, so λ is fixed at 0.1 in the subsequent experiments.

Experiment 2: The number of latent features of probabilistic matrix factorization is determined. With 50 iterations, the number of latent features is set to 5.

Experiment 3: Probabilistic matrix factorization is combined with different clustering algorithms, and the results for different numbers of clusters are compared, with MAE as the evaluation index, to determine the optimal number of clusters.

Experiment 4: The RMSE values of probabilistic matrix factorization fused with a clustering algorithm are compared: the fusion with spectral clustering and the fusion with K-Means are each compared with CF. The fusion with spectral clustering has the lowest RMSE, indicating that it performs best and can effectively alleviate the cold start problem.

Experiment 5: With the parameters determined in the four experiments above, predicted ratings are output for different numbers of neighbors. The traditional collaborative filtering algorithm, collaborative filtering with probabilistic matrix factorization but without clustering (PMF), and the improved algorithm of this thesis (PMF-SC) are compared, with RMSE and MAE as evaluation indicators. The RMSE decreases, indicating that the algorithm effectively improves accuracy and has some reference value.
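The matrix-filling step and the two evaluation indicators described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The values λ=0.1, 5 latent features and 50 iterations come from the abstract; everything else is an assumption made for illustration: the function names `pmf_fit` and `fill_matrix`, training by stochastic gradient descent, the learning rate, centering on the global mean, and clipping predictions to a 1 to 5 rating scale.

```python
import numpy as np

def pmf_fit(ratings, n_users, n_items, n_factors=5, lam=0.1,
            n_epochs=50, lr=0.01, seed=0):
    """Fit user/item latent factors on observed ratings.

    ratings: array with one (user_index, item_index, rating) row per
    observed rating. SGD, lr and mean-centering are assumptions of this
    sketch, not details taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, n_factors))
    V = 0.1 * rng.standard_normal((n_items, n_factors))
    mu = ratings[:, 2].mean()  # global mean rating
    for _ in range(n_epochs):
        for idx in rng.permutation(len(ratings)):
            u = int(ratings[idx, 0])
            i = int(ratings[idx, 1])
            err = ratings[idx, 2] - mu - U[u] @ V[i]
            u_old = U[u].copy()
            # Gradient step on squared error plus L2 penalty (weight lam)
            U[u] += lr * (err * V[i] - lam * U[u])
            V[i] += lr * (err * u_old - lam * V[i])
    return U, V, mu

def fill_matrix(U, V, mu, lo=1.0, hi=5.0):
    """Dense user-item matrix of predicted ratings, clipped to the scale."""
    return np.clip(mu + U @ V.T, lo, hi)

def rmse(pred, true):
    """Root mean square error."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(true)) ** 2)))

def mae(pred, true):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(true))))
```

In this sketch, `fill_matrix(*pmf_fit(ratings, n_users, n_items))` would give a dense matrix on which a clustering step and within-cluster collaborative filtering could then operate; `rmse` and `mae` would be computed on held-out ratings.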
Keywords/Search Tags:Recommendation Algorithm, Probability Matrix Factorization, Spectral Clustering, Collaborative Filtering Algorithm, Sparsity