Clustering Of Heterogeneous Information Networks Based On Nonnegative Matrix Tri-factorization

Posted on: 2017-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: H X Li
GTID: 2348330488959958
Subject: Software engineering
Abstract/Summary:
A heterogeneous information network (HIN) is a network in which objects of multiple types are interconnected through relations of multiple types. Many kinds of real-world data can be modeled as HINs. Most current research in network science, however, assumes that social and information networks are homogeneous, that is, that all nodes are objects of the same entity type. Because a HIN carries more information than a homogeneous information network, the study of HINs has become a popular research theme in recent years.

Clustering plays an important role in mining knowledge from HINs. Several HIN clustering algorithms have been proposed, but they suffer from one or more of the following problems: (1) inability to model a general HIN; (2) inability to generate clusters for all types of objects simultaneously; (3) inability to use the information between objects of the same type.

In this thesis, we propose a HIN clustering algorithm that handles general HINs, generates clusters for all types of objects simultaneously, and uses the information between objects of the same type. First, we transform a general HIN into a meta-path-encoded relationship set. Second, we propose a multi-type clustering method, HMFClus, that clusters all types of objects in a HIN simultaneously. Third, we integrate the information between objects of the same type into HMFClus through a similarity regularization term, yielding HMFClus-S. Extensive experiments on real-world datasets show that the proposed algorithm outperforms state-of-the-art methods.
Keywords/Search Tags: Data mining, Heterogeneous Information Network, Co-Clustering, Nonnegative Matrix Factorization
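
The abstract names two building blocks, meta-path-encoded relations and nonnegative matrix tri-factorization, without giving formulas. The following is a minimal, generic sketch of both ideas in Python with NumPy. It is not the HMFClus or HMFClus-S algorithm from the thesis. It assumes a meta-path relation can be encoded as the product of adjacency matrices along the path, and it uses the textbook multiplicative updates for minimizing ||R - F S G^T||^2 plus an optional graph-regularization term lam * tr(F^T (D - W) F) as a stand-in for the similarity regularization. All function and parameter names (`metapath_relation`, `nmtf`, `W`, `lam`) are hypothetical.

```python
import numpy as np


def metapath_relation(*adjacency):
    """Compose adjacency matrices along a meta-path.

    Example: author-paper followed by paper-venue gives an
    author-venue relation matrix (assumed encoding, not the thesis's).
    """
    rel = np.asarray(adjacency[0], dtype=float)
    for a in adjacency[1:]:
        rel = rel @ np.asarray(a, dtype=float)
    return rel


def nmtf(R, k_rows, k_cols, W=None, lam=0.0, n_iter=200, seed=0, eps=1e-9):
    """Generic nonnegative matrix tri-factorization R ~ F S G^T.

    F (n x k_rows) holds soft row-cluster memberships, G (m x k_cols)
    holds soft column-cluster memberships, and S links the two.
    W is an optional nonnegative similarity matrix between row objects;
    lam weights the regularizer tr(F^T (D - W) F).
    """
    R = np.asarray(R, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = R.shape
    F = rng.random((n, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    G = rng.random((m, k_cols)) + eps
    if W is None:
        W = np.zeros((n, n))
    D = np.diag(W.sum(axis=1))  # degree matrix of the similarity graph

    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep every factor nonnegative.
        F *= (R @ G @ S.T + lam * (W @ F)) / (
            F @ S @ G.T @ G @ S.T + lam * (D @ F) + eps
        )
        G *= (R.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ R @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(float(np.linalg.norm(R - F @ S @ G.T) ** 2))
    return F, S, G, errors


# Toy data: 4 authors, 4 papers, 2 venues (made up for illustration).
author_paper = np.eye(4)
paper_venue = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
author_venue = metapath_relation(author_paper, paper_venue)

F, S, G, errors = nmtf(author_venue, k_rows=2, k_cols=2)
author_clusters = F.argmax(axis=1)  # hard labels from soft memberships
venue_clusters = G.argmax(axis=1)
print(author_clusters, venue_clusters, errors[-1])
```

Both object types receive cluster labels from a single factorization, which is the sense in which co-clustering is "simultaneous". A general HIN has several relation matrices, and a method of the kind described in the abstract would need to factorize them jointly; this sketch handles only one.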