Classification in the presence of missing covariates

Posted on: 2008-04-16
Degree: Ph.D
Type: Thesis
University: Carleton University (Canada)
Candidate: Montazeri-Najafabadi, Zahra
Full Text: PDF
GTID: 2448390005956677
Subject: Statistics
Abstract/Summary:
This thesis studies the problem of classification in the presence of missing covariates and proposes methods for constructing consistent classifiers under various missingness patterns. We derive representations for the best classifier when some of the covariates can be missing, without imposing any assumptions on the underlying missing probability mechanism. Furthermore, without assuming any MAR-type (missing at random) conditions, we construct consistent classifiers that do not require imputation-based techniques. When the MAR assumption holds, we employ kernel-based imputation and Horvitz-Thompson-type inverse weighting approaches to handle the missing covariates. The validity of the resulting classifiers is assessed via both theory and numerical examples.
The thesis is organized as follows. Chapter 1 contains the basic definitions and concepts needed throughout the thesis; it introduces notation, discusses a few preliminary results, and briefly reviews the literature on the problem. Chapter 2 gives two representations of the functional form of the optimal classifier when a block of covariates is missing; it also proposes consistent parametric and nonparametric classifiers. Chapter 3 introduces the Swiss-cheese model, in which missing values can be anywhere among the covariates, derives the best classifier, and constructs consistent classifiers under various conditions. Chapter 4 applies the results of Chapters 2 and 3 to find consistent classifiers based on histogram rules as well as general partitioning rules. Chapter 5 uses the least-squares approach to perform nonparametric classification in the presence of missing covariates, employing both kernel-based imputation and Horvitz-Thompson-type inverse weighting. Using the theory of empirical processes, the performance of the resulting classifiers is assessed by obtaining exponential bounds on the deviations of their conditional misclassification errors from that of the best classifier. Chapter 6 contains simulation studies illustrating the proposed methods on artificial examples as well as real data sets. Finally, Chapter 7 suggests some possible future studies.
Keywords/Search Tags:Missing, Presence, Chapter, Classification, Consistent classifiers, Thesis