Non-IID Recommender System

Posted on: 2015-04-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F F Li
Full Text: PDF
GTID: 1228330422993441
Subject: Computer Science and Technology
Abstract/Summary:
In the information age, the rapid growth of the Internet gives users access to large amounts of information that can satisfy their needs. However, when facing billions of data items, it is difficult to find the truly valuable information, which limits how efficiently information can be used. This is the well-known information overload problem. Recommender systems (RS) have been proposed to help users tackle information overload by suggesting potentially interesting items. A typical RS has a set of users and a set of items, with each user u rating various items according to different preferences. The key task of an RS is to predict unknown ratings or to recommend relevant items for a given user. However, traditional recommendation techniques such as Collaborative Filtering (CF) and Matrix Factorization (MF) normally rely on the user-item rating matrix only, which causes the cold start and sparsity problems. In this dissertation, we aim to solve these two problems and propose a series of non-IID recommendation models. The key idea is to leverage complementary information in RS, such as social friendships, group or community information, user attributes, and item properties. We mainly focus on analysing the coupling relationships within such information, and then propose several novel algorithms: coupled item-based matrix factorization, hybrid coupled matrix factorization that integrates user couplings and item couplings, and coupled group-based matrix factorization. Theoretical and experimental analysis demonstrates the improvement gained by incorporating such information into RS. In addition, for group formation, we propose a new theoretical clustering analysis framework that integrates supervised learning. The main contributions of the dissertation are as follows:
(1) We propose a Coupled Item-based Matrix Factorization (CIMF) model.
Existing recommendation methods that incorporate the relationships between item attributes assume that different attributes are independent and identically distributed (IID). In fact, attributes are more or less coupled with each other through implicit relationships. Based on this observation, we incorporate objective item attributes as complementary information in a non-IID manner to address the essential problems of RS. We propose an attribute-based coupled similarity measure to capture the implicit relationships between items, and then integrate this implicit item coupling into MF to form the CIMF model. CIMF considers both users’ rating preferences on items and the coupling relationships among items in a non-IID way, and partly overcomes the cold start and sparsity problems. Experimental results on the MovieLens and Book-Crossing data sets demonstrate that CIMF outperforms traditional CF and MF methods.
(2) We propose a Coupled Matrix Factorization (CMF) framework. Like item properties, user attributes are also coupled with each other through implicit relationships, which are helpful for solving the cold start and sparsity problems. It is therefore also important to consider user couplings in non-IID RS. We first analyse the implicit user couplings and integrate them with the item couplings above. Based on this coupled analysis of users and items, we then propose a novel generic Coupled Matrix Factorization (CMF) framework that incorporates the coupling relations within users and within items. Such couplings integrate the intra-coupled interaction within an attribute and the inter-coupled interaction among different attributes to form a coupled representation of users and items. The CMF model is beneficial for addressing the cold start and sparsity challenges.
Experimental results on the MovieLens and Book-Crossing data sets demonstrate that the user/item couplings can be effectively applied in RS and that CMF outperforms the comparison methods.
(3) We propose a Coupled Group-based Matrix Factorization (CGMF) method. In addition to the couplings within users and items described above, users’ friendships and group information are also helpful for making better recommendations. More and more researchers have been trying to incorporate social friendships into RS, and the underlying assumption of social recommendation is that a user’s taste is similar to that of his/her friends in the social network. In fact, users enjoy different groups of items with different preferences, and a user may be trusted by his/her friends more on some specific groups than on all groups. Unfortunately, most existing social RS cannot differentiate a user’s social influence across groups, which leads to unsatisfactory recommendation results. Moreover, most existing systems rely mainly on social relations but overlook the influence of relations between items. We therefore propose the CGMF method, which leverages group information and the relationships within users and items. Users’ social friendships help overcome the cold start problem, and group information helps remedy the data sparsity problem. Experiments conducted on the MovieLens, Last.Fm and DBLP data sets demonstrate the effectiveness of our approach.
(4) We propose the CSAL clustering analysis framework, which integrates supervised learning. To address the sparsity problem and reduce computation time in non-IID RS, group formation is a significant research problem. Besides topic modeling, another good way to form groups is clustering or classification analysis. We therefore propose a new clustering framework to solve the essential and challenging absent-labels problem that exists in most traditional clustering and classification algorithms.
In this framework, clustering is first employed to partition the data, and a certain proportion of the clustered data is selected by our proposed labeling approach for training classifiers. To refine the trained classifiers, we also devise an iterative Expectation-Maximization process. The CSAL framework overcomes the essential absent-labels problem in traditional clustering and classification methods. Experiments are conducted on public data sets to test different combinations of clustering algorithms and classification models, as well as various training data labeling methods. The experimental results show that our approach, together with the self-adaptive method, outperforms traditional clustering and classification methods.
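The cluster-then-classify pipeline described in contribution (4) can be illustrated with a minimal sketch. It is not the dissertation's CSAL implementation: the abstract does not specify the clustering algorithm, the labeling approach, the classifier, or the exact refinement step, so everything concrete here is an assumption. The sketch uses plain k-means, a hypothetical labeling rule that keeps the fraction of each cluster closest to its center, a nearest-centroid classifier, and a simple hard-assignment loop standing in for the EM-style refinement. All function names and parameters (`csal_sketch`, `select_confident`, `fraction`, `rounds`) are illustrative.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (assumed clustering step); returns the final centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers


def select_confident(points, labels, centers, fraction):
    """Hypothetical labeling rule: keep the part of each cluster nearest its center."""
    chosen = []
    for k in range(len(centers)):
        members = [p for p, y in zip(points, labels) if y == k]
        members.sort(key=lambda p: math.dist(p, centers[k]))
        keep = max(1, int(len(members) * fraction)) if members else 0
        chosen.extend((p, k) for p in members[:keep])
    return chosen


def csal_sketch(points, k, fraction=0.5, rounds=5, seed=0):
    """Cluster, pseudo-label a confident subset, fit a classifier, and iterate."""
    centers = kmeans(points, k, seed=seed)
    labels = [nearest(p, centers) for p in points]
    for _ in range(rounds):
        train = select_confident(points, labels, centers, fraction)
        # Fitting a nearest-centroid classifier: one centroid per pseudo-class.
        new_centers = []
        for c in range(k):
            cls = [p for p, y in train if y == c]
            new_centers.append(mean(cls) if cls else centers[c])
        # Re-predict every point with the refined classifier.
        new_labels = [nearest(p, new_centers) for p in points]
        converged = new_labels == labels
        centers, labels = new_centers, new_labels
        if converged:
            break
    return labels


# Toy data (made up for illustration): two well-separated 2-D blobs.
rng = random.Random(1)
blob_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
blob_b = [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]
labels = csal_sketch(blob_a + blob_b, k=2)
```

On data like this the refinement loop stops after the first round, because the classifier trained on the confident subset reproduces the cluster labels. The loop only changes assignments when points near cluster boundaries are relabeled by the refitted classifier, which is where a real labeling approach and classifier would matter.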
Keywords/Search Tags: Recommender Systems, Matrix Factorization, Attribute Analysis, Coupled Relation Analysis, Social Recommendation, Clustering Analysis