
Community Detection In Heterogeneous Social Networks

Posted on: 2017-05-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Wang
GTID: 1220330482981410
Subject: Computer application technology
Abstract/Summary:
Community detection (CD) in real-world social networks has become one of the most active research areas in social network analysis and data mining in recent years, and analyzing large-scale social network data is becoming increasingly complicated and challenging. With the development of Web 2.0 and online social networks, online social interaction takes many forms, so considering only the community structure of a homogeneous social network is no longer enough. The concept of heterogeneous social networks has therefore been proposed: an abstraction of complex networks that includes many kinds of relations (multi-dimensional networks) and many kinds of entities (multi-mode networks). Handling this complex structure and finding communities in heterogeneous social networks is a new challenge, different from traditional community detection in social networks. This thesis focuses on multi-dimensional and multi-mode networks in order to solve the problem of finding communities in heterogeneous networks. It introduces the semantic information of heterogeneous networks to help find communities, and it proposes unsupervised community detection algorithms that do not require prior knowledge of the number of communities.
This thesis proposes a common analysis framework for heterogeneous social networks, label propagation based community detection algorithms, and a topic model based community detection algorithm, in order to detect the community structure of heterogeneous social networks quickly and effectively.
1) It proposes a common analysis framework for heterogeneous social networks. Through network transformation and dimensionality reduction, the heterogeneous network is transformed into a homogeneous network or a bipartite graph, and a community detection algorithm is then applied to that homogeneous network or bipartite graph. This successfully transforms the detection problem on heterogeneous social networks into a detection problem on a homogeneous network or bipartite graph.
2) After transforming the heterogeneous network into a homogeneous network, it proposes a Parallel Hybrid Seed Expansion (PHSE) algorithm to find overlapping communities in social networks. To obtain natural communities, PHSE employs local optimization of a fitness function and greedy seed expansion with a novel hybrid seed selection strategy. For better scalability, a parallel implementation of the algorithm is also provided. PHSE achieves performance comparable to that of LFM on both synthetic networks and real-world social networks, especially on LFR benchmark graphs with high levels of overlap: it can accurately detect communities on synthetic networks in which the fraction of overlapping nodes is On = 50%.
3) It proposes an Improved Speaker-listener Label Propagation Algorithm (iSLPA), an efficient near-linear method for community detection. iSLPA works automatically on three kinds of networks: directed networks, undirected networks, and, in particular, bipartite networks. It introduces a new initialization and updating strategy to improve the quality and scalability of community detection. Experiments are conducted on both benchmark networks and real-world social network data, including Douban user datasets.
Experimental results demonstrate that iSLPA achieves performance comparable to that of SLPA and confirm that it is efficient and effective for overlapping community detection on large-scale networks.
4) It proposes a Hybrid Label Propagation Algorithm (HLPA) for finding communities in large-scale real-world social networks. HLPA uses a different label initialization strategy and a novel hybrid label updating strategy so that it can handle different kinds of networks, such as directed, undirected, and bipartite networks. It uses a label decaying strategy to avoid “monster” communities, which allows smaller communities to grow fully. Compared with previous label propagation based methods, HLPA achieves very high accuracy, and it produces detection results on large-scale networks very quickly. Experiments on large-scale real-world social networks that compare HLPA with state-of-the-art algorithms confirm its superiority and generality. HLPA needs only 37.12 minutes to run on a network with 3 million nodes and 0.17 billion edges, and the community structure it detects is confirmed to be meaningful.
5) Based on the common analysis framework for heterogeneous social networks in 1), this thesis proposes topic-aware community detection on heterogeneous social networks. The multi-mode network is transformed into a two-mode (user-document) network, the LDA-light algorithm maps this two-mode network to a weighted bipartite (user-topic) graph, and the proposed Weighted-LPA (WLPA) then detects communities on the bipartite graph.
Finally, the resulting communities contain both user and topic entities, which means that the communities carry semantic information; this helps to analyze and understand the community structure of heterogeneous social networks better.
The algorithms proposed in this thesis are general and can be extended to many heterogeneous social networks and data sets; they can also be adapted to many real-world problems.
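The last step of item 5), label propagation on a weighted user-topic bipartite graph, can be illustrated with a small sketch. The abstract does not specify how WLPA works, so the Python code below is only a generic weighted label propagation written under assumptions, not the thesis's algorithm: the function name `weighted_label_propagation`, the `(user, topic, weight)` edge format, the fixed update order, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def weighted_label_propagation(edges, max_iters=100):
    """Generic weighted label propagation on a user-topic bipartite graph.

    Assumption: `edges` is an iterable of (user, topic, weight) triples with
    positive weights, e.g. topic proportions produced by a topic model.
    This is an illustrative sketch, not the WLPA described in the thesis.
    """
    # Build an undirected weighted adjacency map. Nodes are tagged with
    # their mode so a user and a topic with the same name stay distinct.
    adjacency = defaultdict(dict)
    for user, topic, weight in edges:
        u, t = ("user", user), ("topic", topic)
        adjacency[u][t] = adjacency[u].get(t, 0.0) + weight
        adjacency[t][u] = adjacency[t].get(u, 0.0) + weight

    # Every node starts in its own community, labelled by its own id.
    labels = {node: node for node in adjacency}
    order = sorted(adjacency)  # fixed order keeps the sketch deterministic

    for _ in range(max_iters):
        changed = False
        for node in order:
            # Sum edge weights per label among the node's neighbours.
            score = defaultdict(float)
            for neighbour, weight in adjacency[node].items():
                score[labels[neighbour]] += weight
            best_score = max(score.values())
            # Keep the current label if it is already one of the best.
            if score.get(labels[node], 0.0) == best_score:
                continue
            # Otherwise adopt the heaviest label; ties go to the smallest.
            labels[node] = min(lab for lab, s in score.items() if s == best_score)
            changed = True
        if not changed:
            break

    # Group nodes by final label; each community mixes users and topics.
    communities = defaultdict(list)
    for node, label in labels.items():
        communities[label].append(node)
    return sorted(sorted(members) for members in communities.values())


# Made-up toy data: two topics and four users.
example_edges = [
    ("alice", "graphs", 0.9), ("alice", "cooking", 0.1),
    ("bob", "graphs", 0.8),
    ("carol", "cooking", 0.7),
    ("dave", "cooking", 0.9), ("dave", "graphs", 0.1),
]
example_communities = weighted_label_propagation(example_edges)
```

On this toy input the sketch groups alice and bob with the topic "graphs" and carol and dave with the topic "cooking", so each community contains both users and the topic that characterizes them, which is the kind of semantically labelled result the abstract describes.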
Keywords/Search Tags: community detection, heterogeneous social networks, label propagation algorithm, topic model