Graph-based data analysis: Tree-structured covariance estimation, prediction by regularized kernel estimation and aggregate database query processing for probabilistic inference

Posted on:2009-09-09

Degree:Ph.D

Type:Dissertation

University:The University of Wisconsin - Madison

Candidate:Bravo, Hector Corrada

GTID:1448390005952807

Subject:Statistics

Abstract/Summary:

This dissertation presents a collection of computational techniques for the analysis of data where relationships between objects can be expressed through a graph. Data of this type can be found in many and diverse settings, including genomic and epidemiological applications, web search, social networking and decision making. Although taking relationships into account makes analysis of this type of data more challenging, the graph structure of these relationships can be used to make this analysis viable. In this dissertation, we implement a number of techniques for analyzing this type of data using well-known and tested computational tools. Furthermore, we explore these techniques over a wide array of biological and decision making applications.

In Part I, we present a method for estimating tree-structured covariance matrices directly from observed continuous data. Tree-structured covariance matrices encode probabilistic relationships between objects that can be described by rooted trees. In this case, we directly estimate graph structure from observed data under a specific probabilistic model.

Part II presents a methodology for graph-based prediction where a predictive model is estimated over data where relationships between objects are encoded by a known graph. We make extensive use of Regularized Kernel Estimation (Lu et al., 2005), a framework for estimating a positive semidefinite kernel from noisy, incomplete and inconsistent distance data. In this case, the graph structure of the data is used to define a distance from which a kernel matrix is estimated.

Finally, in Part III, we present techniques for efficiently evaluating aggregate queries of a particular type over views defining a large number of database records. The main assumption is that this view is the result of a stylized join over a number of much smaller tables, and is described by a graph. We make use of this graph structure to reduce the cost of single query evaluation and to cache intermediate results in a query workload setting. This framework was designed in part to address scalable probabilistic inference in relational databases.
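
As a rough illustration of the idea in Part III, the sketch below contrasts two ways of evaluating a sum aggregate over a view defined by a join of two small weighted tables. It is not the dissertation's algorithm: the tables R and S, their column layout, and the function names are hypothetical, chosen only to show how the join structure lets an aggregate be pushed into a smaller table before the tables are combined, so the full view is never materialized.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy tables (assumptions for illustration only).
# R holds rows (a, b, weight); S holds rows (b, c, weight).
# The view is the join of R and S on b, with weight = product of row weights.
R = [("a1", "b1", 0.5), ("a1", "b2", 0.5), ("a2", "b1", 1.0)]
S = [("b1", "c1", 0.2), ("b1", "c2", 0.8), ("b2", "c1", 1.0)]


def sum_by_a_naive(r_rows, s_rows):
    """Materialize every joined row, then sum weights per value of a."""
    totals = defaultdict(float)
    for (a, b, wr), (b2, _c, ws) in product(r_rows, s_rows):
        if b == b2:
            totals[a] += wr * ws
    return dict(totals)


def sum_by_a_pushed(r_rows, s_rows):
    """Sum S over c first, then combine with R; the view is never built."""
    s_marginal = defaultdict(float)
    for b, _c, ws in s_rows:
        s_marginal[b] += ws
    totals = defaultdict(float)
    for a, b, wr in r_rows:
        if b in s_marginal:  # rows of R with no join partner contribute nothing
            totals[a] += wr * s_marginal[b]
    return dict(totals)


if __name__ == "__main__":
    print(sum_by_a_naive(R, S))
    print(sum_by_a_pushed(R, S))
```

Both functions return the same per-group totals. The second touches each input row once, and its intermediate table (the sum of S by b) is the kind of result that could be cached and reused across a query workload.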

Keywords/Search Tags:

Data, Probabilistic, Relationships between objects, Tree-structured covariance, Graph, Query, Kernel, Estimation

Related items

1	An Improved Probabilistic Database Model And Its Probabilistic Nearest Neighbors Query Research
2	Research On The Methods Of Uncertainty Data Indexing And Querying In Mobile Environments
3	Research On Query Processing On XML Data
4	The Study Of The Probabilistic Query Processing Techniques And The Analysis Of The Reachable Region For Moving Objects Located In A Constrained Space
5	Research On XML Data Theory Based On Probability
6	The Research On Structured Query Generation Framework Based On Semantic Query Graph
7	Research On Moving Objects And Its Nearest Neighbor Query Algorithms
8	Research On Query Processing Of Graph-Structured XML Data
9	Functional data analysis of populations of tree-structured objects
10	A Querying Method Over Moving Objects In Probabilistic Database