Font Size: a A A

A New Dynamic Clustering Algorithm For Building Overlapping Clusters

Posted on:2016-06-05Degree:MasterType:Thesis
Country:ChinaCandidate:X M WangFull Text:PDF
GTID:2298330467497099Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Social network is a form of social organization based on interactionand influence of the individual members, which is analyzed from theperspective of the human society in the earliest days. With thedevelopment of computer and Internet technology, it gradually evolvedinto a social science and computer science research field. Social networkswidely exist in the real world, such as interpersonal relationship network,research citation network, epidemic spreading networks and the internet.Therefore, mining the laws and characteristics of social network is ofpractical significance. The most significant feature of social networksis the existence of the cluster structure, which is often referred to asthe cluster or community. And the most frequently used method to findthe cluster structure is clustering.Clustering is one of the main methods in data mining, which purposeis to show aggregation characteristics existing in large amount of datathrough the automated method, and to extract the underlying rules. Theresults of clustering are called clusters. Objects in a cluster sharesimilar attributes, while objects in different clusters have differentattributes. Clustering has a long history, and is used in many fields,such as medicine, topic detection and tracking, image segmentation,social network analysis and so on.There are many clustering approaches. According to the basic idea ofthe algorithm, they can be divided into constraint based clustering,hierarchical clustering, clustering based on machine learning, highdimensional data clustering and so on. Most clustering algorithms willdivide one object into one cluster, i.e. different clusters are disjoint. However, in the real world, an individual can belong to multiple groups.For instance, a person is both a member of the family and a member of theorganization; also he can form a circle with his old classmates.Reflecting on the problem of clustering, it means that there will beoverlaps between different clusters and nodes can belong to multipleclusters. Previous studies require no overlaps between clusters and onenode belonging to only one cluster, so this problem has been the concernof many researchers. This paper introduces related overlapping clusteringalgorithms.There are many disadvantages in the current overlapping clusteringalgorithms, such as high computational complexity, too large overlappingarea, unable to process the dynamic changes of data sets, etc. With thedevelopment of Internet technology, especially the arrival of the era ofWeb2.0, there are more interactive behavior between Internet users andnetwork. Some applications such as RSS, blog, news website, twitter, SNSand so on, the information in them changes rapidly, and currentoverlapping clustering algorithms cannot properly deal with this kind ofdynamic change. Based on this, we make improvements on the star subgraphbased overlapping clustering algorithm, and consider the connectiondensity and strength between nodes, and expand the cluster scale, andreduce the overlapping area. Based on the LFR benchmark datasets, theexperiments in this paper use the clustering number, the accuracy, recallrate and F1value of the overlapping nodes, and the normalized mutualinformation value (NMI) as the evaluation criteria, and compare the timeconsumed by updating clusters dynamically. The experimental results showthat the new algorithm can form high quality clusters, and has higheraccuracy and F1value in finding overlapping nodes and higher NMI valuewhen dealing with sparse networks, and consumes less time in dynamicprocessing of nodes adding and deleting. In general, the new proposed algorithm is more suitable for the overlapping clustering and dynamicanalysis of complex networks in the real world.
Keywords/Search Tags:Star subgraph, overlapping clustering, connection strength, dynamic
PDF Full Text Request
Related items