Font Size: a A A

The Study Of Structure-based Graph Classification Algorithm

Posted on:2019-06-26Degree:MasterType:Thesis
Country:ChinaCandidate:W Y ShaoFull Text:PDF
GTID:2428330545970257Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
An increasing number of data has exhibited its new features of strong structuration and intricacy of connection among data,due to the widespread application of data mining in several areas such as information science,bioinformatics,network intrusion detection,etc.Graph,as a kind of complex data structure,can be applied to demonstrate the mutual relationship among objects,meanwhile,the difficulty of the graph data mining procedure is increased because of the complexity of the graph structure.Former methods of the data mining mostly tend to focus on vector data,however,these methods present their limitations when being directly applied in graph-structured data.Therefore,researches concentrating on graph data mining methods have become more of an important issue while specialists expect to solve new issues of graph mining by counting on the experience and the methods of structured data mining.Graph classification is one of the most important branches in the field of graph mining.This paper analyzes and researches the structure-based graph classification methods from two dimensions,graph kernel function and eigenvector construction respectively,by comprehensively utilize the knowledge of graph theory and the technology of data mining.The specific research contents of this paper are as follows:(1)Graph classification based on graph set reconstruction and graph kernel feature reduction.Traditional graph classification based on graph kernel firstly uses graph kernel function in order to map the graph data to high-dimensional vector space.Then the graph data can be categorized by the common classifier.However,the process of feature selection on graph data is ignored during this procedure,which leads to the redundancy of graph that reduces the performance of the classification model.To reconstruct the graph data set and narrow down the scale of the graph data,this paper primarily considers the discriminative power of frequent subgraphs and infrequent subgraphs in the meantime and removes the subgraphs with low discrimination from the original graph data set.Secondly,the graph kernel function is used to map the new reconstructed graph data set to the high-dimensional feature space.Afterward,the kernel discriminant analysis is applied to reduce the dimension of the high-dimensional vector,then the graph data can be vectorized while the category information is contained in this vector.At last,the graphs can be categorized by the common classifiers.(2)Graph classification based on graph structure embedding.Traditional graph classification based on graph feature vector construction requires selecting a sort of construction standard of feature vector proactively.For instance,the standard based on the theoretical index or the occurrence number of graph topological structure,etc.Then the graphs can be transformed to the vectors by extracting features from each graph in the dataset following the designated standard.However,this construction method of graph feature vector can simply result in the loss of the structure information of the graph and needs strong professional knowledge.Inspired by the Word2Vec and Doc2Vec model in NLP,a "word list" which is consisted of subgraphs and designed for graph data is built firstly.Then a neural network which is utilized to train the graph embedding is designed.The graph itself is used as its input,while the "vocabulary" in the graph and the attribute characteristics act as the output of the neural network.So that the graph embedding corresponding to each graph can be learned automatically.And the graph embedding not only reflects the features of the graph itself but also contains the relative relation among graphs.Finally,the common classifiers can be used to classify the graphs based on the well-trained graph embedding.
Keywords/Search Tags:graph classification, discriminative subgraphs selection, graph kernel, graph embedding
PDF Full Text Request
Related items