Supervised learning in dynamic streaming graphs

Posted on:2017-09-08

Degree:Ph.D

Type:Thesis

University:Washington State University

Candidate:Yao, Yibo

Full Text:PDF

GTID:2468390014973027

Subject:Computer Science

Abstract/Summary:

With the emergence of nEtworked data, graph classification has received considerable interest during the past years. Most approaches to graph classification focus on designing effective kernels to compute similarities for static graphs. However, they become computationally intractable in terms of time and space when a graph is presented in an incremental fashion with continuous updates, i.e., insertions of nodes and edges. In this thesis, we examine the problem of classification in large-scale and incrementally changing graphs. We focus on two classification scenarios: the graph transaction scenario, and the one large graph scenario. We propose a unified framework to study this problem. The framework consists of three components: a subgraph extraction process, an online version of an existing fast graph kernel, and two kernel-based classifiers (SVM and Perceptron). To classify nodes in a single large graph, we design an entropy-based subgraph extraction strategy to select informative neighbor nodes and discard those with less discriminative power, to facilitate an effective classification process. By retaining the support vectors from each learning step, the classification models built using the kernel-based classifiers can be incrementally updated whenever new changes are made to the subject graph. We demonstrate the advantages of our learning techniques by conducting an empirical evaluation on several large-scale real-world graph datasets.;On the other hand, detecting concept drift in data streams has been widely studied in the data mining community. Conventional drift detection methods use classifiers' outputs (e.g., classification accuracy, error rate) as indicators to signal concept changes. As a result, their performance greatly depends on the chosen classifiers. However, there is little work on addressing concept drift in graph-structured data. We present an entropy-based method to effectively detect concept drift in graph streams. Contrary to many related works, we investigate the intrinsic properties of data (i.e., subgraph distribution w.r.t. category labels), instead of monitoring classification outputs. Our approach is combined with the proposed incremental graph learning techniques and tested on synthetic and real-world graph data streams. The experimental results demonstrate the advantage of our method in detecting concept drift as well as improving classification performance.

Keywords/Search Tags:

Graph, Classification, Concept drift, Data

Related items

1	Research On Data Stream Classification Algorithm Of Adapting To The Concept-drift
2	Study On Data Streams Online Classification Algorithm Of Adapting To The Concept-Drift
3	Research On Classification Algorithm For Conceptual Drift Data Flow
4	Research On Ensemble Classification Algorithms Of Data Stream Based On Concept Drift
5	Research On Classification Method And Application Of Concept Drift Time Series Data
6	Classification Algorithm For Data Streams With Concept Drift And Its Applications
7	Research On Classification Of Data Stream With Recurring Concept Drift
8	Research On Online Ensemble Classification Algorithm Based On Concept Drift Detection
9	Research On Data Stream Classification Method Based On Concept Drift Detection
10	Research On Data Streams Classification With Concept Drift