Data mining relational databases with probabilistic relational models

Posted on: 2007-11-24
Degree: M.Sc
Type: Thesis
University: McGill University (Canada)
Candidate: Chen, Yu
Full Text: PDF
GTID: 2458390005487476
Subject: Computer Science
Abstract/Summary:
Relational databases are a popular method for organizing and storing data. Unfortunately, many machine-learning techniques are unable to handle complex relational models. The Probabilistic Relational Model (PRM) is an extension of the Bayesian network framework that can express relational structure as well as probabilistic dependencies. In this thesis, we significantly expand and improve an implementation of PRMs that allows conditional probability distributions to be defined over discrete and continuous variables. The thesis takes as its starting point an implementation that has various problems and runs very slowly when a database management system (DBMS) is used as storage. It discusses alternative algorithms that improve the accuracy of the learned models, improve computing performance, and correct the inference problems of the existing implementation. The focus is on techniques that reduce the running time of the algorithms when the implementation is used to learn from data stored in a DBMS. The thesis provides experimental results from using this package on both synthetic and real data sets.
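To make the idea of a relational dependency concrete, the sketch below estimates a conditional probability table for an attribute whose parent lives in a related table, and lets the database do the counting with a single JOIN and GROUP BY rather than fetching rows one at a time. It is only an illustration and is not taken from the thesis: the course/registration schema, the toy data, and the function names build_demo_db and estimate_cpd are all assumptions, and SQLite stands in for whatever DBMS the thesis used.

```python
import sqlite3
from collections import defaultdict


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE course (id INTEGER PRIMARY KEY, difficulty TEXT);
        CREATE TABLE registration (
            id INTEGER PRIMARY KEY,
            course_id INTEGER REFERENCES course(id),
            grade TEXT
        );
        INSERT INTO course VALUES (1, 'high'), (2, 'low');
        INSERT INTO registration (course_id, grade) VALUES
            (1, 'A'), (1, 'B'), (1, 'B'), (1, 'B'),
            (2, 'A'), (2, 'A'), (2, 'A'), (2, 'B');
    """)
    return conn


def estimate_cpd(conn):
    """Maximum-likelihood estimate of P(registration.grade | course.difficulty).

    The parent attribute is reached through the foreign key
    registration.course_id, and the counts (sufficient statistics)
    are computed inside the database in one aggregate query.
    """
    rows = conn.execute("""
        SELECT c.difficulty, r.grade, COUNT(*)
        FROM registration AS r
        JOIN course AS c ON c.id = r.course_id
        GROUP BY c.difficulty, r.grade
    """).fetchall()

    counts = defaultdict(dict)
    for difficulty, grade, n in rows:
        counts[difficulty][grade] = n

    # Normalize the counts for each parent value into a distribution.
    cpd = {}
    for difficulty, by_grade in counts.items():
        total = sum(by_grade.values())
        cpd[difficulty] = {g: n / total for g, n in by_grade.items()}
    return cpd


cpd = estimate_cpd(build_demo_db())
```

Pushing the aggregation into the query means only one row per (parent value, child value) pair crosses the database boundary, which is one plausible way to cut running time when the data stays in a DBMS.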
Keywords/Search Tags: Data, Relational, Probabilistic, Thesis