Latent factor models for statistical relational learning

Posted on:2011-02-05

Degree:Ph.D.

Type:Thesis

University:Hong Kong University of Science and Technology (Hong Kong)

Candidate:Li, Wu-Jun

GTID:2448390002955516

Subject:Computer Science

Abstract/Summary:

To simplify modeling, traditional statistical machine learning methods typically assume that instances are independent and identically distributed (i.i.d.). However, real-world data such as web pages and research papers often contain relationships (links) between instances. Instances in such data are correlated (linked) with each other, so the common i.i.d. assumption is unreasonable for relational data. Hence, naively applying traditional statistical learning methods to relational data may lead to misleading conclusions about the data.

Statistical relational learning (SRL), which attempts to perform learning and inference in domains with complex relational structures, has emerged as a research area because relational data exist in a wide variety of application areas, such as web mining, social network analysis, bioinformatics, economics and marketing. The existing mainstream SRL models extend traditional graphical models, such as Bayesian networks and Markov networks, by removing their underlying i.i.d. assumption. Typical examples of such SRL models include relational Bayesian networks, relational Markov networks, and Markov logic networks. Because the dependency structure in relational data is typically very complex, structure learning for these relational graphical models is often very time-consuming. Hence, it might be impractical to apply these models to large-scale relational data sets.

In this thesis, we propose a series of novel SRL models, called relational factor models (RFMs), by extending traditional latent factor models from i.i.d. domains to relational domains. These RFMs provide a toolbox for different learning settings: some are well suited for transductive inference while others can be used for inductive inference; some are parametric while others are nonparametric; some model data with undirected relationships while others model data with directed relationships. One promising advantage of our RFMs is that they require no time-consuming structure learning, and the time complexity of most of them is linear in the number of observed links in the data. This implies that our RFMs can be used to model large-scale data sets. Experimental results show that our models can achieve state-of-the-art performance in many real-world applications such as linked-document classification and social network analysis.
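
To make the linear-cost claim concrete, the following is a minimal sketch of a generic latent factor model for undirected links. It is not one of the thesis's RFMs, whose details are not given in this abstract. It assumes a symmetric logistic model in which the probability of a link between nodes i and j is the sigmoid of the inner product of their latent vectors. All names (`train_link_factors`, `link_probability`, the toy graph, and the hyperparameters) are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_link_factors(num_nodes, observed, dim=2, epochs=200,
                       lr=0.1, reg=0.01, seed=0):
    """Fit one latent vector per node by SGD on a logistic link model.

    `observed` is a list of (i, j, y) triples with i != j and y in {0, 1}.
    Each epoch visits every observed pair once, so its cost is
    O(len(observed) * dim). This is an illustrative assumption, not the
    thesis's algorithm.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(num_nodes)]
    for _ in range(epochs):
        for i, j, y in observed:
            err = y - sigmoid(dot(U[i], U[j]))
            ui, uj = U[i][:], U[j][:]  # copies, so both updates use old values
            for d in range(dim):
                U[i][d] += lr * (err * uj[d] - reg * ui[d])
                U[j][d] += lr * (err * ui[d] - reg * uj[d])
    return U


def link_probability(U, i, j):
    return sigmoid(dot(U[i], U[j]))


# Toy undirected graph (made up): nodes 0-2 and nodes 3-5 form two groups.
# Pairs inside a group are linked (y=1); pairs across groups are not (y=0).
group = [0, 0, 0, 1, 1, 1]
held_out = {(0, 2), (1, 4)}  # pairs hidden from training
observed = [
    (i, j, 1 if group[i] == group[j] else 0)
    for i in range(6)
    for j in range(i + 1, 6)
    if (i, j) not in held_out
]

U = train_link_factors(6, observed)
p_within = link_probability(U, 0, 2)  # held-out pair inside one group
p_across = link_probability(U, 1, 4)  # held-out pair across groups
print(p_within, p_across)
```

The training loop touches only the observed pairs and never materializes the full node-by-node matrix, which is why the per-epoch cost grows with the number of observed links and not with the square of the number of nodes. Scoring the held-out pairs afterwards corresponds to a transductive setting, since both endpoints were present during training.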

Keywords/Search Tags:

Models, Relational, Statistical, Data, Traditional

Related items

1	Research On Methods Of Learning Statistical Relational Model
2	Research On Context-Based Statistical Relational Learning
3	Learning statistical models from relational data
4	Efficient Learning of Statistical Relational Models
5	Statistical learning from relational databases
6	Research On Migration Algorithm From Traditional Relational Database To Non Relational Database
7	Statistical Methods for Evaluating Relational Structures in Multi-Dimensional Phenotypic Data for Neuropsychiatric Disorders
8	Unsupervised Spatial, Temporal and Relational Models for Social Processes
9	Research On Some Problems Of Statistical Relational Learning
10	Effective Learning of Probabilistic Models for Clinical Predictions from Longitudinal Data