Font Size: a A A

Data Efficient and Robust Algorithms for Reconstructing Large Graphs

Posted on:2015-02-23Degree:Ph.DType:Dissertation
University:The University of Wisconsin - MadisonCandidate:Dasarathy, GautamFull Text:PDF
GTID:1478390017499921Subject:Engineering
Abstract/Summary:
Many modern signal and information processing tasks center around the discovery of (significant) interactions in a large, complex system from measurements made about it. This can be thought of as reconstructing a graph, that is, the discovery of the (significant) edges of an appropriate graph from data. In this dissertation, we devise provably robust and efficient algorithms for various graph reconstruction problems.;The first problem we consider is the reconstruction of the (router-level) topology of computer networks. We demonstrate a family of algorithms that leverages the structure present in the underlying network to recover the routing topology more accurately and with far fewer probes than previous techniques.;We next show that similar techniques can be brought to bear on the more general problem of hierarchical clustering based on pairwise similarities. In particular, we exhibit algorithms for reliable recovery of the hierarchical clustering that are (a) near optimal in terms of the number of pairwise similarities needed, and (b) robust in terms of the regularity assumptions on the pairwise similarity values.;The third problem concerns the recovery of more general graph structures and is motivated by an application we call covariance sketching: the recovery of the covariance graph (i.e., the graph indicating the dependence among variables) from compressed samples of the underlying random process. We show that under certain assumptions, this can be done efficiently and robustly using a simple convex program. The theoretical framework we develop has direct implications for compressed acquisition of multi-dimensional signals as well.;Finally, we consider an important problem from quantitative biology -- phylogenetic inference from molecular data corresponding to several genes. It is known that the evolutionary history of individual genes might be topologically distinct from each other and from the underlying species tree, possibly confounding phylogenetic analysis. A further complication in practice is that one has to estimate gene trees from molecular sequences of finite length. To address this, we devise a novel algorithm that explicitly takes into account gene tree estimation errors and provably improves over all previous methods in a regime of interest.
Keywords/Search Tags:Graph, Algorithms, Data, Robust
Related items