View learning: A statistical relational approach to mining biomedical databases

Posted on: 2008-09-24
Degree: Ph.D
Type: Dissertation
University: The University of Wisconsin - Madison
Candidate: Davis, Jesse Jon
Full Text: PDF
GTID: 1448390005966955
Subject: Artificial Intelligence
Abstract/Summary:
This dissertation develops and evaluates statistical relational learning (SRL) algorithms that can automatically alter the schema of a database by learning new field and table definitions. The algorithmic advances made in this dissertation are motivated by important problems such as providing a decision support system for radiologists who read mammograms, predicting three-dimensional Quantitative Structure-Activity Relationships for drug design, and performing entity resolution, the task of recognizing when two molecules, patients, biological pathways, etc. are actually the same.

This dissertation introduces view learning for SRL: the ability to automatically alter the schema of a database through the addition of new fields or tables. It presents two algorithms for augmenting the schema of a database by adding new fields, and then extends view learning by developing an algorithm for performing predicate invention. We demonstrate the utility of view learning for SRL in two ways. First, it learns significantly more accurate models on a wide variety of domains. Second, it uncovers important and useful knowledge in these domains. For example, it identified a novel feature from a mammography report that is indicative of malignancy.

Motivated by the preceding work, the last part of the dissertation investigates the relationship between receiver operator characteristic (ROC) space and precision-recall (PR) space. Among other contributions, this part proves that for a fixed number of positive and negative examples, one curve dominates another curve in ROC space if and only if the first curve dominates the second curve in PR space. This result implies the existence of an analog to the convex hull for PR space, which we call the achievable PR curve, and it provides an efficient algorithm for constructing the achievable curve.
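
The correspondence between ROC space and PR space described in the abstract can be illustrated with a short Python sketch. It is not taken from the dissertation: the function names (`roc_to_pr`, `roc_convex_hull`, `achievable_pr_points`) are hypothetical, and the approach shown (take the upper convex hull in ROC space, then map its vertices into PR space for fixed counts of positives and negatives) is one plausible reading of the construction the abstract alludes to. The sketch returns only the vertices of the resulting curve and does not attempt to interpolate between them.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def roc_to_pr(fpr: float, tpr: float, n_pos: int, n_neg: int) -> Point:
    """Map an ROC point (FPR, TPR) to a PR point (recall, precision).

    Assumes fixed counts of positive (n_pos) and negative (n_neg) examples.
    """
    tp = tpr * n_pos          # expected true positives
    fp = fpr * n_neg          # expected false positives
    if tp + fp == 0:
        raise ValueError("precision is undefined when nothing is predicted positive")
    return tpr, tp / (tp + fp)  # recall equals TPR


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (negative for a clockwise turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, always including (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the segment to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def achievable_pr_points(points: List[Point], n_pos: int, n_neg: int) -> List[Point]:
    """Map ROC hull vertices into PR space, skipping undefined precision."""
    result = []
    for fpr, tpr in roc_convex_hull(points):
        if tpr * n_pos + fpr * n_neg > 0:
            result.append(roc_to_pr(fpr, tpr, n_pos, n_neg))
    return result


if __name__ == "__main__":
    roc_points = [(0.1, 0.5), (0.3, 0.6), (0.4, 0.9)]
    print(roc_convex_hull(roc_points))
    print(achievable_pr_points(roc_points, n_pos=100, n_neg=100))
```

In this example the ROC point (0.3, 0.6) lies below the hull and is dropped, so it contributes no vertex to the PR-space curve either.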
Keywords/Search Tags:View learning, Database, Curve, SRL, Space, Dissertation