
Stochastic graph grammars: Parameter learning and applications

Posted on: 2011-05-10
Degree: Ph.D
Type: Thesis
University: University of Maryland, Baltimore County
Candidate: Sourav Mukherjee
Full Text: PDF
GTID: 2448390002968950
Subject: Computer Science
Abstract/Summary:
Although traditional data mining algorithms often assume the data to have a feature-vector representation, emerging domains for data mining such as bioinformatics and social networks require the underlying data to be represented as graphs. In this thesis, we present stochastic graph grammars as probability models suitable for mining graph databases.

We first consider the problem of learning the parameters of stochastic graph grammars from training data, and observe that the only known solution to this problem, PEGG (Parameter Estimation for Graph Grammars), relies on bottom-up parsing of the training instances. Parsing can be prohibitively expensive for many real grammars. To overcome this difficulty, we present the QuickPEGG algorithm, which learns the grammar parameters more efficiently. QuickPEGG exploits the fact that, for most grammars, the probability distributions of the number of nodes and the number of edges of the underlying graph distribution contain sufficient information to infer the grammar parameters, rendering parsing unnecessary for parameter learning.

We then consider applications of the learned grammar to knowledge discovery in the underlying domain. In the existing literature, most applications of graph grammars are dominated by sampling and parsing. However, we show that once a stochastic graph grammar has been learned, interesting statistics such as the degree distribution may be inferred from it. Because of this property, stochastic graph grammars may be used as generative models for useful families of networks, such as scale-free networks, in which the degree of a node follows a power-law distribution.

In summary, our work establishes stochastic graph grammars as a framework suitable for mining graph databases; the algorithms presented in this thesis improve our ability to efficiently learn grammar parameters from training data. The applications shown in this thesis enable us to obtain a deeper understanding of the underlying graph distribution once the grammar has been learned.
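The central idea behind QuickPEGG, as described in the abstract, is that the size distribution of generated graphs carries enough information to recover the grammar's rule probabilities without parsing. The sketch below illustrates this on a deliberately simplified, hypothetical grammar (not the grammars studied in the thesis): a single nonterminal rewrites into two nonterminals with probability p or into a terminal node with probability 1 - p, so the expected number of nodes is 1 / (1 - 2p), and p can be recovered by inverting the empirical mean size.

```python
import random

def sample_graph_size(p, rng):
    """Sample the node count of one derivation from a toy stochastic
    grammar: each pending nonterminal becomes a node, then rewrites
    into two new nonterminals with probability p (p < 0.5), or
    terminates with probability 1 - p."""
    nodes = 0
    pending = 1  # number of nonterminals still to rewrite
    while pending:
        pending -= 1
        nodes += 1
        if rng.random() < p:
            pending += 2
    return nodes

def estimate_p(sizes):
    """Recover p from sampled sizes by inverting E[N] = 1 / (1 - 2p),
    mirroring the idea of estimating grammar parameters from the size
    distribution instead of parsing each training graph."""
    mean = sum(sizes) / len(sizes)
    return (1 - 1 / mean) / 2

rng = random.Random(0)
true_p = 0.3
sizes = [sample_graph_size(true_p, rng) for _ in range(20000)]
print(round(estimate_p(sizes), 3))  # close to 0.3
```

Note that the grammar, the closed-form mean, and the estimator here are illustrative assumptions for a one-parameter case; QuickPEGG itself addresses general stochastic graph grammars, where both node- and edge-count distributions are used.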
Keywords/Search Tags: Graph, Parameter learning, Data, Applications, Distribution, Underlying, Mining