Research On Hybrid Collaborative Filtering Recommendation Algorithms Combining Context

Posted on: 2017-01-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Ji
Full Text: PDF
GTID: 1108330485460315
Subject: Computer Science and Technology
Abstract/Summary:
With the spread of computers and the development of network technology, Internet information services have gradually penetrated all aspects of people’s lives and are fundamentally transforming traditional lifestyles. In recent years especially, the widespread use of mobile devices (e.g., smartphones and tablets) and the rise of apps (e.g., WeChat and microblogs) have broken through the time and space restrictions of traditional PC clients, making it easier, freer, and faster for people to obtain and share information via the Internet. As Internet information services have flourished, however, the volume of information resources has exploded. It is now very difficult for people to find what they need on the Internet, a situation known as the information overload problem. Against this background, recommender systems were proposed and have become one of the most effective ways to address the problem.
At present, collaborative filtering (CF) is the most widely used and most successful technique in recommender systems. Given only a small amount of historical rating data between users and items, it can quickly build a usable system for predicting users’ potential needs, and it has the advantages of being simple, easy to use, and accurate. Over time, however, data volumes have grown, data types have become richer, and application environments have become more complex. Traditional CF algorithms now face more severe problems, such as data sparsity, cold start, limited scalability, and poor interpretability. Recently, some research that fuses context into CF has achieved some performance improvement. These preliminary attempts show that context and users’ interests are closely linked. Introducing context can improve prediction accuracy and user satisfaction, so fusing context into CF algorithms is of considerable research significance.
Therefore, this thesis systematically analyzes CF algorithms, studies context in more depth, and then, for rating data with different kinds of context, designs several hybrid CF algorithms that use context more effectively to overcome the problems facing current recommender systems. The main work and innovations of this thesis are as follows:
1. CF that fuses items’ classification and content information. At present, most research on the scalability and cold-start problems focuses only on users and rarely considers the items that are dynamically updated in the system; as a result, it still scales poorly to large numbers of items and cannot make good recommendations for new items. We find that, given the items’ classification, items belonging to the same category share some similar content attributes or other latent characteristics, so a user will have similar interest in them. Based on this, this research starts from items’ relations and features and uses context such as items’ classification and keywords to propose a layered learning CF algorithm, which gradually optimizes users’ interests. Analysis shows that it scales well to large numbers of items and can solve the cold-start problem for new items. Experimental results on a real dataset show that it not only achieves higher prediction accuracy on data with different degrees of sparsity but also makes good cold-start predictions for new items.
2. CF that fuses the association information between content context. Although items’ classification can help optimize users’ interests based on item similarity, it must be constructed in advance, which limits the applicability of the above algorithm. In addition, the above algorithm does not scale with the number of users and cannot solve the cold-start problem for new users.
To design a more general and scalable CF algorithm, we turn our attention to the content context, i.e., users’ content information (labels) and items’ content information (keywords). User-item ratings can establish associations between the two kinds of content context. Based on this, this research starts from the content context and combines CF with a content-based method to propose an indirect CF algorithm, which generates predictions according to the similarity between content context. Analysis shows that it has strong interpretability and scalability. Experimental results on a real dataset show that it not only achieves higher prediction accuracy on data with different degrees of sparsity but also makes good cold-start predictions for both new users and new items.
3. CF that fuses the potential shared information between subgroups. Besides directly combining context with CF, a class of subgroup-based algorithms has appeared in recent years; these use context to divide the whole dataset into subgroups and then run CF on each subgroup separately to generate predictions. However, unbalanced sparse data can make the CF results on the subgroups unstable. By analyzing the subgroups, we find that there are implicit relations between the users and items of different subgroups. Based on this, this research starts from the potential shared information among the subgroups and proposes a knowledge transfer-based cross-group CF algorithm, which constructs several approximations of the rating matrix from the CF results of the better-performing subgroups and then generates predictions as a weighted sum of these approximations. Analysis shows that it can avoid unnecessary computation on poorly performing subgroups.
Experimental results on a real dataset show that it improves on the prediction accuracy of traditional subgroup-based CF algorithms, especially on very sparse data, which indicates that it can effectively alleviate the data sparsity problem.
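The final combination step of the cross-group algorithm, a weighted sum of several rating-matrix approximations, can be illustrated with a small sketch. The abstract does not say how the weights are obtained or whether they are normalized, so the function name `weighted_sum_prediction`, the nested-list matrix representation, and the normalization of the weights to sum to 1 are all assumptions made for illustration. This is not the author's implementation.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def weighted_sum_prediction(
    approximations: Sequence[Matrix], weights: Sequence[float]
) -> Matrix:
    """Combine rating-matrix approximations into one prediction matrix.

    Hypothetical helper: each approximation is a users x items matrix of
    predicted ratings, e.g. one per well-performing subgroup. Weights are
    normalized to sum to 1 (an assumption, not stated in the abstract).
    """
    if not approximations or len(approximations) != len(weights):
        raise ValueError("need exactly one weight per approximation")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalized = [w / total for w in weights]

    n_users = len(approximations[0])
    n_items = len(approximations[0][0])
    combined = [[0.0] * n_items for _ in range(n_users)]
    for approx, w in zip(approximations, normalized):
        for u in range(n_users):
            for i in range(n_items):
                combined[u][i] += w * approx[u][i]
    return combined


# Two made-up 2x2 approximations; the first is trusted three times as much.
approx_a = [[4.0, 2.0], [3.0, 5.0]]
approx_b = [[2.0, 4.0], [5.0, 1.0]]
pred = weighted_sum_prediction([approx_a, approx_b], [3.0, 1.0])
print(pred)  # [[3.5, 2.5], [3.5, 4.0]]
```

In this sketch the weights are simply passed in by the caller. How they would be derived from subgroup performance is left open here, because the abstract does not describe it.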
Keywords/Search Tags: Recommender systems, Collaborative filtering, Matrix factorization, Hierarchical classification, Content association, Transfer learning