| Networks can serve as a powerful language for representing relational information among data objects from social, natural, and academic domains. One way to understand a network is to identify and analyze groups of nodes that share the same properties or functions. The research task of discovering such groups is known as the community detection problem. Traditional methods mainly focus on finding disjoint communities and are all based on the restrictive assumption that each node can belong to only one community. Relaxing this assumption yields the more general overlapping community detection problem, which has attracted considerable attention recently. There are generally two types of information that can be used to discover overlapping communities. The first is the link structure, i.e., the presence and absence of edges. The second is the node attributes. Most overlapping community detection algorithms model only the connection structure of the network, but the attributes of each node also play a decisive role. Because noise is prevalent in link structures, approaches that detect communities from both types of information have gained increasing popularity. The most intuitive and important node attributes are text and time information. In this paper, we study the problem of overlapping community detection in temporal text networks. A temporal text network is a directed network in which each node has textual content and temporal information. Such networks are ubiquitous in the real world. Typical examples include online blog networks, the World Wide Web (WWW), email correspondence networks, and academic citation networks. By examining 32 large temporal text networks, we find many edges connecting two nodes that share no common community, and we discover that nodes in the same community have similar textual content. This scenario cannot be quantitatively modeled by nearly all existing community detection methods. Motivated by these empirical observations, we propose MAGIC (Model Affiliation Graph with Interacting Communities), a generative model based on the Affiliation Graph Model (AGM) that captures community interactions and considers information from both link structures and node attributes. Our experiments on three types of datasets show that MAGIC achieves large improvements over 4 state-of-the-art methods in terms of 4 widely used metrics. |