
Locally adaptive techniques for pattern classification

Posted on: 2003-03-11    Degree: Ph.D.    Type: Dissertation
University: University of California, Riverside    Candidate: Domeniconi, Carlotta    Full Text: PDF
GTID: 1468390011480998    Subject: Computer Science
Abstract/Summary:
Pattern classification faces a difficult challenge in finite-sample, high-dimensional settings because of the curse of dimensionality. Nearest neighbor methods are especially sensitive to this problem: in high-dimensional spaces they must rely on large neighborhoods, which produces highly biased estimates. Because feature relevance is local in nature, any fixed metric violates the assumption of locally constant class posterior probabilities and therefore fails to make correct predictions in different regions of the input space. To achieve accurate predictions, it becomes crucial to estimate the varying degrees of relevance that the input features have at different locations of the feature space.

This dissertation introduces novel approaches to computing local feature relevance for nearest neighbor methods that overcome the limitations of previous techniques. It makes four specific contributions:

(1) ADAMENN algorithm: a novel approach to computing local feature relevance for pattern classification. It uses the Chi-squared distance to design a flexible metric that approximates the theoretical infinite-sample risk at the query point, making no assumption about the probability distribution of the data. It is formally shown that the measure of feature relevance derived by ADAMENN reduces the overall mean-squared estimation error. (A generic sketch of this style of locally adaptive classification appears after this summary.)

(2) LFM-SVM algorithm: a new local flexible metric technique based on support vector machines. This method overcomes the scalability and efficiency limitations of lazy learning approaches. It is shown that the weighting scheme performed by the LFM-SVM algorithm increases the margin of the solution provided as input by the SVM.

(3) GenProClus algorithm: a novel algorithm that computes intra-cluster adaptive metrics for clustering. The algorithm is an attempt to circumvent the curse of dimensionality in clustering and provides information about which features are relevant for each partition. It is shown that the algorithm converges to a local minimum of the associated error function.

(4) AdaBand algorithm: a new locally adaptive technique for setting the bandwidth parameters of kernel density estimation. This approach can serve as an efficient approximation of nearest neighbor methods. It is shown how the algorithm can be used to efficiently solve classification, clustering, and range query approximation problems.

The efficacy of the presented techniques is demonstrated through extensive experimental evaluations on a variety of simulated and real-world problems, such as texture recognition in images and letter classification.
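To make the core idea concrete, the following is a minimal sketch, not the dissertation's ADAMENN procedure, of a locally adaptive nearest neighbor classifier: a per-query estimate of feature relevance reweights the distance metric before the neighborhood vote. The relevance heuristic (between-class versus within-class spread over a pilot neighborhood), the function names, and the parameter values are all illustrative assumptions, not taken from the dissertation.

import numpy as np
from collections import Counter

def local_feature_weights(X, y, query, k=50):
    """Estimate per-feature relevance near `query` (illustrative heuristic:
    between-class spread divided by within-class spread, computed over the
    k nearest neighbors under the plain Euclidean metric)."""
    d = np.linalg.norm(X - query, axis=1)
    idx = np.argsort(d)[:k]
    Xn, yn = X[idx], y[idx]
    overall_mean = Xn.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(yn):
        Xc = Xn[yn == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    relevance = between / (within + 1e-12)
    total = relevance.sum()
    if total <= 0:
        return np.full(X.shape[1], 1.0 / X.shape[1])  # fall back to uniform weights
    return relevance / total                          # normalize to a weight vector

def adaptive_knn_predict(X, y, query, k=5, k_local=50):
    """Classify `query` with a k-NN vote under the locally weighted metric."""
    w = local_feature_weights(X, y, query, k=k_local)
    d = np.sqrt(((X - query) ** 2 * w).sum(axis=1))   # weighted Euclidean distance
    nn = np.argsort(d)[:k]
    return Counter(y[nn]).most_common(1)[0][0]

In the dissertation, ADAMENN instead derives the local weights from a Chi-squared distance involving the estimated class posterior probabilities at the query point and over its neighborhood; the simple spread-based heuristic above merely stands in for that estimate to illustrate how a query-dependent metric changes the neighborhood.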
Keywords/Search Tags: Classification, Local, Nearest neighbor methods, Adaptive, Feature relevance, Algorithm, Techniques