Font Size: a A A

Research On Graph Structure Data Reduction Method Based On Rough Set Theory

Posted on:2022-07-16Degree:MasterType:Thesis
Country:ChinaCandidate:N SuFull Text:PDF
GTID:2480306755950249Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Traditional attribute reduction algorithms usually default to the assumption that samples are independent and identically distributed.However,there are many real-world cases where objects have inherent dependencies,i.e.,do not satisfy the assumption that samples are independent and identically distributed,and this form of data can be easily represented by graph structures.Our paper focuses on data with a priori dependencies,i.e.,graph-structured data,and studies its reduction algorithm in the context of node classification tasks.We start from two aspects to study the reduction of graph-structured data.The first aspect is the reduction of graph structure and the second is the reduction of node attributes.The problem of graph structure reduction concerns how to aggregate the nodes(edges)of the original graph while maintaining some properties of the original graph so as to construct a summary graph with fewer nodes and a more compact structure.For this problem,several techniques have been proposed in the field of graph summarization that can be implemented.However,none of them consider the node classification quality of the original graph.If these algorithms are directly used as preprocessing for the node classification task,they will have a very bad impact on the downstream node classification model.Therefore,we design a novel clustering-based graph summarization technique,which uses rough set theory and fuzzy cluster analysis to identify and remove redundant or irrelevant structures in the original graph and obtain a core representation of the original graph.The algorithm constructs the summary graph by merging nodes with identical or similar neighborhoods,while maintaining the loss of node classification quality within a tolerable range.Specifically,first,for a node in the graph,we compute the affinities of this node to all other nodes,and these affinities form a vector that represents the neighborhood structure of this node.We use the neighborhood structure to measure the structural similarity relationship between nodes.Second,based on this relationship and the known label information,with the help of fuzzy cluster analysis and rough set theory,we can determine the equivalence relationship between nodes regarding the graph structure and the known labeling information.Finally,we merge the equivalent nodes and thus construct a summary graph.Our experiments are conducted on three real-world datasets.The results show that our proposed algorithm has the ability to adaptively adjust the reduction rate compared to other graph summarization algorithms,thus achieving better performance in node classification tasks.In addition,although our algorithm simplifies the information of the original graph,the summary graph has better performance than the original graph in many cases,which indicates that our algorithm can remove the noise information of the original graph.The problem of node attribute reduction concerns how to conduct node attribute reduction more efficiently with the help of inherent priori relationships(graph structure)between nodes.This problem can be decomposed into two simple problems: one is how to extract the equivalence relation on the set of nodes from the inherent priori relations of nodes,and the other is how to combine this equivalence relation with the node equivalence relation derived from the node attributes.For the first problem,three solutions are given in this paper,i.e.,an approach based on second-order similarity of nodes,an approach based on unsupervised graph representation learning,and an approach based on spectral clustering.The first two approaches aim to obtain a node vector representation that can represent the nodes' a priori relations,and then equivalence relations can be computed based on this representation.The last approach aims to obtain directly the partition of the node set and thus derive directly the equivalence relations.For the second problem,we propose a general framework for combining equivalence relations based on node prior relations with equivalence relations based on node attributes.We conduct experiments on real-world datasets,and the experimental results demonstrate the effectiveness of our proposed approach.
Keywords/Search Tags:graph summarization, attribute reduction, rough set theory, network embedding, graph neural network, semi-surprised learning, node classification
PDF Full Text Request
Related items