Online and active learning of big networks: Theory and algorithms

Posted on:2015-08-18

Degree:Ph.D

Type:Thesis

University:University of Illinois at Urbana-Champaign

Candidate:Gu, Quanquan

Full Text:PDF

GTID:2478390017992529

Subject:Computer Science

Abstract/Summary:

We are living in the Internet Age, in which information entities and objects are interconnected, thereby forming gigantic information networks. These networks are not only massive, but also grow and evolve very quickly. It is critical to quickly process and understand these networks in order to enable data-driven applications. On the other hand, the labels of the nodes in big networks are scarce. It is urgent to optimize the process by which the labels are collected, because it is unrealistic to get labels of every node. The objective of my research is to develop algorithms for big network analytics, which are both statistically and computationally efficient, and with provable guarantee on their performance. In particular, I present active learning, online learning, selective sampling (online active learning), and online learning with bandit feedback algorithms for learning in a network.;In the first part of this thesis, I propose a nonadaptive active learning approach on a graph, based on generalization error bound minimization. In particular, I present a data-dependent error bound for a graph-based learning method, namely learning with local and global consistency (LLGC). I show that the empirical transductive Rademacher complexity of the function class for LLGC provides a natural criterion for active learning. The resulting active learning approach is to select a subset of nodes on a graph such that the empirical transductive Rademacher complexity of LLGC is minimized. I propose a simple yet effective sequential optimization algorithm to solve it.;In the second part of this thesis, I first present an online learning algorithm on a graph for binary node classification. It is an online version of the well-known Learning with Local and Global Consistency method (OLLGC). It is essentially a second-order online learning algorithm, and can be seen as an online ridge regression in the Hilbert space of functions defined on a graph. I prove its regret bound in terms of the structural property (cut size) of a graph. Based on OLLGC, I present a selective sampling algorithm, namely Selective Sampling with Local and Global Consistency (SSLGC), which adaptively queries the label of each node based on the confidence of the linear function on a graph. Its regret bound as well as its label complexity are also derived. I also analyze the low-rank approximation of graph kernels, which enables the online algorithms scale to large graphs.;In the third part of this thesis, I first extend the online binary classification algorithm to multi-class regime, based on spectral learning technique. The resulting algorithm is called online spectral learning on a graph (OSLG). Then I study online learning on a graph with bandit feedback, where after the learner makes a prediction of the node label, the oracle will return a single bit indicating whether the prediction is correct or not. I propose an algorithm namely online spectral learning on a graph with bandit feedback (OSLG Bandit), which uses upper-confidence bound of the nonparametric classifier on a graph to trade off the exploration and exploitation of label information. I derive regret bounds for both algorithms, which clearly characterize the difficulty of online learning on a graph for multi-class node classification, both in the full information setting and in the partial information setting.

Keywords/Search Tags:

Online, Active learning, Networks, Graph, Algorithm, Information, Node, Local and global consistency

Related items

1	Research On Optimization Of Semi-supervised Classification Algorithm Combining With Active Learning
2	Research On Active Contour Segmentation Model Combining Local And Global Information
3	Low-Rank Structure Based Learning Data Representation
4	Two-view Consistency-based Active Learning Algorithm In Image Classification
5	Person Re-Identification Based On Graph Matching And Structure Learning
6	Learning Graph Representations With Global Structural Information
7	Leveraging Explicit Local Semantics For Graph Representation Learning
8	An Online Supervised Learning Algorithm Based On Local Variable Driven STDP For Spiking Neural Networks
9	The Research On Collaboration-training Algorithm And Its Application
10	Saliency Detection Via Jointly Modeling Global Shape And Local Consistency