Font Size: a A A

Research On Community Detection Algorithms Based On Vertex Distance And Clustering Methods

Posted on:2017-02-13Degree:MasterType:Thesis
Country:ChinaCandidate:Y LiFull Text:PDF
GTID:2308330482489812Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Social networks are interactive network of relationships between individuals. With the rapid development of Internet in recent years, the type of social network is becoming more and more diverse, and social network contains more and more valuable information. Community detection within social networks is one of the most preliminary areas of research over the past decade. Community detection has been widely used in the analysis of protein functions, user’s behavior analysis, network anomaly detection and many other fields, because community detection can find the common information in social networks.Generally, the community is a collection of the closely interrelated individuals, and the individuals between the communities interrelate relatively sparse than the individuals within the community. A social network is usually abstracted as a graph, which vertexes represent individuals and edges between the vertexes represent the interactions between individuals. Mining community structure in the social network can be treated as dividing a whole graph into several sub-graphs according to interaction relationship between vertexes. Vertexes in the same sub-graph are very similar, and vertexes between sub-graphs are dissimilar, so we can treat community detection problem as a clustering problem in graph.Owing to the features of clustering problem, the main challenges of this work are as follows:1. How to effectively measure the distance between vertexes.2. Based on the distance between vertexes, what kind of clustering algorithm should be used to cluster vertexes in social networks?For these two questions, in this paper, we solve them as follows:1. For the vertexes distance measuring challenge, we analysis the deficiency of cosine distance and Jaccard distance, and introduce the length of the shortest path between vertexes in network. And then we respectively integrate the length of the shortest path withcosine distance and Jaccard distance, and propose the improved cosine distance and the improved Jaccard distance, which measure the distance between vertexes much better in the network. 2. For the clustering algorithm selecting challenge, we select density peaks clustering method and agglomerative hierarchical clustering method, which are distance-based clustering methods. Density peaks clustering method can determine community centers and the number of communities without parameters selection. Agglomerative hierarchical clustering method can merge vertexes constantly based on vertex distance, and discover the hierarchical structure in the network. In summary, the main works of the paper are as follows: 1. In this paper, we propose two community detection algorithms based on the improved vertex distance measure and density peaks clustering method, CSDPC and JSDPC. CSDPC and JSDPC use the improved cosine distance and the improved Jaccard distance to measure the distance between vertexes respectively, and can determine key vertexes in networks and the number of communities, which has a big advantage comparing to the traditional methods. 2. We propose a community detection algorithm based on the improved vertex distance measure and agglomerative hierarchical clustering method, CSAHC. CSAHC uses the improved cosine distance to measure the distance between vertexes, and then agglomerative hierarchical clustering method merge vertexes constantly, discover hierarchical community structure. Finally, we formulate the different community result selection strategy, according to the difference of network structures. 3. The experiment result on both real world network and synthetic network demonstrate the feasibility and effectiveness of our proposed methods.
Keywords/Search Tags:Community Detection, Vertex Distance, Density Peaks Clustering, Agglomerative Hierarchical Clustering
PDF Full Text Request
Related items