Font Size: a A A

Solving a mixed-integer programming formulation of a classification model with misclassification limits

Posted on:2006-11-22Degree:Ph.DType:Dissertation
University:Georgia Institute of TechnologyCandidate:Brooks, J. PaulFull Text:PDF
GTID:1450390008974828Subject:Operations Research
Abstract/Summary:
Classification, the development of rules for the allocation of observations to one or more groups, is a fundamental problem in machine learning and has been applied to many problems in medicine and business. We consider aspects of a classification model developed by Gallagher, Lee, and Patterson that is based on a result by Anderson. The model seeks to maximize the probability of correct G-group classification, subject to limits on misclassification probabilities. The mixed-integer programming formulation of the model is an empirical method for estimating the parameters of an optimal classification rule, which are identified as coefficients of linear functions by Anderson.; The model is shown to be a consistent method for estimating the parameters of the optimal solution to the problem of maximizing the probability of correct classification subject to limits on inter-group misclassification probabilities. A polynomial time algorithm is described for two-group instances. The method is NP -complete for a general number of groups, and an approximation is formulated as a mixed-integer program (MIP). The MIP is difficult to solve due to the formulation of constraints wherein certain variables are equal to the maximum of a set of linear functions. These constraints are conducive to an ill-conditioned coefficient matrix. Methods for generating edges of the conflict graph and conflict hypergraphs are discussed. The conflict graph is employed for finding cuts in a branch-and-bound framework. This technique and others lead to improvement in solution time over industry-standard software on instances generated by real-world data. The classification accuracy of the model in relation to standard classification methods on real-world and simulated data is also noted.
Keywords/Search Tags:Classification, Model, Mixed-integer, Formulation
Related items