
Research On Key Problems Of Collaborative Filtering Recommendation Algorithms

Posted on: 2017-05-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S S Huang
GTID: 1108330488451930
Subject: Computer system architecture

Abstract/Summary:
In recent decades, the development of information technology and Web 2.0 has driven explosive growth in available information, leading to information overload. Recommender systems are an effective tool for overcoming this problem. They mine user interests from historical behavior and proactively recommend items that match users' interests and needs. Recommender systems are now widely deployed on online platforms, for example product recommendation at Amazon and video recommendation at Netflix and YouTube. In academia, many types of recommendation algorithms have been proposed, among which collaborative filtering (CF) has become the most popular class by virtue of its advantages. Although collaborative filtering has achieved great success in personalized recommendation, several key problems restrict its further development.

In this dissertation, building on existing work and with the support of NSF, we carry out a series of studies addressing the data sparsity problem, the scalability problem, and the Top-n recommendation problem in collaborative filtering algorithms. The main research contents and innovations are as follows:

(1) We put forward a new collaborative filtering algorithm with Linked Data. Traditional matrix factorization recommendation algorithms cannot accurately learn user and item latent factors because of data sparsity. In this work, we use high-quality data from Linked Data to mitigate the effect of data sparsity on matrix factorization. First, we measure the similarities between items based on their structured attributes in Linked Data. Then we propose two item-similarity-sensitive matrix factorization recommendation algorithms. We assume that the latent feature vectors of two items should be close if the items have similar attributes, which breaks the independence between items assumed in standard matrix factorization. Experimental results show that our algorithm copes effectively with data sparsity and still generates accurate recommendations even for items with few ratings.

(2) We propose a bipartite graph recommendation algorithm based on user groups. To address the data sparsity and scalability problems of bipartite graph recommendation algorithms, we apply clustering to group users. Specifically, we first use singular value decomposition (SVD) as a dimension reduction technique to obtain the users' feature space. Taking into account the diversity of user interests, we then use the fuzzy c-means clustering algorithm to divide users into multiple user groups, where each user can belong to more than one group. Based on the resulting user groups, the original bipartite graph is split into multiple denser and relatively smaller sub-graphs, on which the computational cost of recommendation is greatly reduced. The experimental results show that, unlike previous clustering-based collaborative filtering algorithms, which improve recommendation efficiency at the expense of accuracy, our method improves scalability while preserving recommendation accuracy.

(3) We propose a hybrid multi-group co-clustering recommendation framework based on information fusion. Most previous clustering-based CF models use only historical rating information to cluster users or items. Because of data sparsity, however, such clustering may not be effective. To cope with this problem, we integrate user-item rating data, user-user social relations, and item-item association information, and on this basis propose a new hybrid multi-group co-clustering method. The proposed method clusters users and items simultaneously, and each user and each item can belong to multiple groups. The original rating matrix is then divided into multiple sub-matrices according to the resulting user and item groups, and CF recommendation algorithms are applied to each sub-matrix to generate intermediate recommendation results. Finally, we aggregate the intermediate results to make the final recommendations. The experimental results show that our clustering method achieves higher recommendation accuracy than previous clustering methods and also mitigates the data sparsity and scalability problems.

(4) We present a novel listwise collaborative filtering recommendation algorithm. Targeting Top-n recommendation, the algorithm omits the rating prediction step and predicts the item ranking directly. Specifically, we first transform each user's ratings into a probability distribution over permutations of the item set by means of the Plackett-Luce model, and we measure the item ranking similarity between users with the Kullback-Leibler divergence. We then define a cross-entropy loss function weighted by the user similarities. The loss function is minimized by gradient descent, and the item ranking is predicted for each target user. To improve the practicality and efficiency of the algorithm, we propose incremental updating methods for computing user similarities, which greatly reduce the computation time. Experimental results on three benchmark datasets show that our algorithm is much more efficient than pairwise collaborative filtering and achieves higher Top-n recommendation accuracy than state-of-the-art recommendation algorithms.
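As a rough illustration of the user-similarity step in contribution (4), the sketch below turns each user's ratings into a ranking distribution and compares two users with the Kullback-Leibler divergence. It is not the dissertation's implementation, and it rests on several assumptions not stated in the abstract: the full permutation distribution is replaced by the top-1 marginals of a Plackett-Luce model with scores exp(rating), only co-rated items are compared, the divergence is symmetrized, and it is mapped to a similarity with the hypothetical formula 1 / (1 + KL). The function names are illustrative only.

```python
import math


def top1_probabilities(ratings):
    """Top-1 marginals of a Plackett-Luce model with scores exp(rating).

    `ratings` maps item -> rating. The result maps item -> probability
    that the item is ranked first. (Assumption: exp() as score function.)
    """
    highest = max(ratings.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {item: math.exp(r - highest) for item, r in ratings.items()}
    total = sum(weights.values())
    return {item: w / total for item, w in weights.items()}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same items."""
    return sum(p[i] * math.log(p[i] / q[i]) for i in p)


def ranking_similarity(ratings_u, ratings_v):
    """Similarity of two users' item rankings on their co-rated items.

    Assumption: symmetrized KL mapped into (0, 1] via 1 / (1 + KL).
    Returns 0.0 when fewer than two items are co-rated.
    """
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    p = top1_probabilities({i: ratings_u[i] for i in common})
    q = top1_probabilities({i: ratings_v[i] for i in common})
    sym_kl = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + sym_kl)


# Hypothetical toy data: item -> rating for three users.
alice = {"a": 5, "b": 3, "c": 1}
bob = {"a": 4, "b": 2, "c": 0, "d": 5}   # same preference order as alice
carol = {"a": 1, "b": 3, "c": 5}         # reversed preference order

print(ranking_similarity(alice, bob))
print(ranking_similarity(alice, carol))
```

Because the distribution depends only on rating differences, two users who rate on shifted scales but rank items identically come out as maximally similar, which is the property a ranking-oriented similarity is after.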
Keywords/Search Tags: Recommender System, Collaborative Filtering, Data Sparsity, Scalability, Top-n Recommendation, Clustering Technique