Learning binary functions: Combinatorial prognosis and diagnosis

Posted on: 2005-01-30
Degree: Ph.D
Type: Dissertation
University: Rutgers The State University of New Jersey - New Brunswick
Candidate: Alexe, Sorin
Full Text: PDF
GTID: 1458390008478514
Subject: Operations Research
Abstract/Summary:
This paper reports on recent theoretical and algorithmic contributions to the development of the Logical Analysis of Data (LAD) method, as well as on LAD applications to prognosis and diagnosis, with a special focus on medical problems. The theoretical framework of Boolean functions is generalized to that of binary functions. Several extensions of the most common concepts are presented, including (prime) implicants, disjunctive normal forms (DNFs), the consensus method, and special classes. These are used to characterize the learning process within the LAD method: given a partially defined binary function (a dataset, or an archive of observations), learn how to predict its values on new, as yet unseen observations, i.e., find an extension of it that preserves the structural properties (information) of the original function. The algorithmic section presents a total polynomial time procedure for evaluating the prevalence of each possible subinterval of the original observation space, i.e., the number of points it contains, and shows that the algorithm can be used efficiently to find logical patterns (or rules) in datasets. The uses of patterns for classification and risk stratification are presented in subsequent sections. The LAD method has some similarities with other data mining techniques, including classification trees, artificial neural networks, and support vector machines. However, there are significant differences both in the techniques used to construct the classifiers and in the types of results that can be obtained; these are discussed in the classification section. The quality of the LAD classifiers is compared, on a benchmark of several publicly available datasets, with that of other classifiers presented in the literature, and LAD classifiers are shown to achieve consistently top accuracies.
Finally, this combinatorial data analysis method is applied to a large medical dataset to stratify the risk of death among cardiac patients, and its results, found to be of high interest by field experts, are presented. Moreover, a comparative study between the LAD results and those provided by the Cox regression model, widely used by the cardiology research community, shows small but important advantages for the LAD model.
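
As an illustration of the notions of subinterval, prevalence, and pattern mentioned in the abstract, the following is a minimal sketch and not the dissertation's algorithm: it counts points by brute force for a single given interval, whereas the work describes a total polynomial time procedure over all subintervals. The names `covers`, `prevalence`, and `is_pure_positive_pattern`, the representation of an interval as per-attribute (low, high) bounds, and the toy data are all assumptions made for this example.

```python
from typing import Sequence, Tuple

# Assumed representation: one closed (low, high) bound per attribute.
Interval = Sequence[Tuple[float, float]]
Point = Sequence[float]


def covers(interval: Interval, point: Point) -> bool:
    """Return True if every coordinate of the point lies within its bounds."""
    return all(lo <= x <= hi for (lo, hi), x in zip(interval, point))


def prevalence(interval: Interval, points: Sequence[Point]) -> int:
    """Number of the given points contained in the interval (brute force)."""
    return sum(1 for p in points if covers(interval, p))


def is_pure_positive_pattern(
    interval: Interval,
    positives: Sequence[Point],
    negatives: Sequence[Point],
) -> bool:
    """A 'pure' positive pattern here covers some positive and no negative points."""
    return prevalence(interval, positives) > 0 and prevalence(interval, negatives) == 0


# Hypothetical toy dataset with three binary attributes.
positives = [(1, 0, 1), (1, 1, 1), (0, 1, 1)]
negatives = [(0, 0, 0), (1, 0, 0)]

# Interval fixing x1 = 1 and x3 = 1 while leaving x2 free.
interval = [(1, 1), (0, 1), (1, 1)]

pos_count = prevalence(interval, positives)
neg_count = prevalence(interval, negatives)
pure = is_pure_positive_pattern(interval, positives, negatives)
```

On this toy data the interval contains two of the three positive observations and none of the negative ones, so under the assumed definition it would qualify as a positive pattern.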
Keywords/Search Tags:LAD, Binary, Functions, Used