Font Size: a A A

Uncertain Graph Data Clustering Methods Based On The Theory Of Belief Functions

Posted on:2020-11-27Degree:DoctorType:Dissertation
Country:ChinaCandidate:K ZhouFull Text:PDF
GTID:1488306740471864Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
Many real systems in our daily life can be represented by complex networks,such as social network,biological network,traffic network and so on.The nodes in the network denote patterns in the systems,while the edges describe the relations between different patterns.It has already be known that there are some small classes(often called communities)in many real-world networks.Community structure is one of the most important properties of complex networks.Generally speaking,a community is a subset of nodes that are closer to other nodes in the same group but very different from those in other groups.Therefore,community detection can be seen as a clustering problem applied on relational data sets.In practice,there is usually a lot of uncertain information included in networks,such as the uncertain structure due to the imprecise relations,the incomplete attribute information caused by the missing information,and the uncertain prior knowledge.This brings lots of challenges to the graph data clustering problem.The Dempster-Shafer theory,as an extension of the probability theory,can deal with many kinds of uncertain information.This work will discuss the uncertain graph data clustering problem in the framework of belief functions.The main contributions of this thesis are listed as follows.·We start with a basic situation where only the dissimilarities between patterns are known,and propose a new (8-partition clustering approach named Median Evidential-Means(MECM).This approach does not require to calculate the geometrical center which is not possible in relational data sets.One optimal sample is regarded as the class prototype,and the median clustering method is extended in the framework of belief function theory.Moreover,a community detection scheme based on MECM is presented.In addition,some practical issues about how to apply the method into community detection problems such as how to determine the initial prototypes and the optimum community number in the sense of credal partitions are discussed.This makes the approach appropriate for graph partitions and enables us to gain a better understanding of the analysed networks,especially for the uncertain class structure.·In some cases the way of using only one node to describe a community may not be sufficient enough.In order to capture various aspects of the community structures,more members rather than one should be referred as the prototypes of an individual group.Motivated by this idea,a Similarity-based Multiple Prototype(SMP)community detection approach is proposed.The centrality values are used as the criterion to select multiple prototypes to characterize each community.The prototype weights are derived to describe the degree of representativeness of objects for their own communities.Then the similarity between each node and community is defined,and the nodes are partitioned into divided communities according to these similarities.Crisp and fuzzy partitions could be obtained by the application of SMP.·Following,we extend SMP in the framework of belief functions to get credal partitions so that we can gain a better understanding of the data structure.The prototype weights are incorporate into the objective function of evidential clustering.The mass membership and the prototype weights could be updated alternatively during the optimization process.In this case,each cluster could be described using multiple weighted prototypes.As we will show,the prototype weights could also provide us some useful information for structure analysis of the data sets.·In real practice,there is usually some prior knowledge about the community labels of some individuals.We extend the original update rule and propagation criterion of the label propagation algorithm(LPA)in the framework of belief functions to take use of the limited supervised information.A new community detection approach,called Semi-supervised Evidential Label Propagation(SELP),is proposed as an enhanced version of the conventional LPA.The community labels are propagated from the labeled nodes to the unlabeled ones step by step according to the proposed evidential label propagation rule.Moreover,the SELP algorithm is extended and well applied on general relational data sets.The experimental results show that SELP performs better than the unsupervised community detection methods even though the amount of supervised information is small.
Keywords/Search Tags:Evidence theory, complex networks, relational data, uncertainty, community structure
PDF Full Text Request
Related items