Approach for mining multiple dependence structure with pattern recognition applications

Posted on: 2004-08-09
Degree: Ph.D
Type: Thesis
University: Chinese University of Hong Kong (People's Republic of China)
Candidate: Liu, Zhiyong
Full Text: PDF
GTID: 2468390011966723
Subject: Computer Science
Abstract/Summary:
Mining multiple dependence structures is a basic task in many fields, such as data mining and pattern recognition. In this thesis we investigate, theoretically or empirically, several typical approaches to mining data dependence structure, with applications to feature extraction and shape detection in images, two basic topics in pattern recognition.

We start with the local PCA model, which considers dependence structure up to second-order statistics. Specifically, we apply the local PCA model to two application cases. The first is astronomical object detection and classification, where a local PCA algorithm based on Bayesian Ying-Yang (BYY) normalization learning serves as both the feature extraction tool and the clustering tool. The second is strip line detection and thinning in images, using a local PCA algorithm based on rival penalized competitive learning (RPCL) that constrains each local subspace to a line-shaped structure.

To describe dependence structure involving higher-order statistics, and further to allow for noise, we then discuss the ICA mixture and NFA mixture models for feature extraction. The ICA mixture, which aims at mining multiple independence structures, is superior to local PCA in that ICA can take advantage of the higher-order statistics of the samples; the NFA mixture in turn improves on the ICA mixture by relaxing ICA's impractical noise-free assumption.
We not only applied the two models as feature extraction tools in a star/galaxy classification system but, more importantly, (1) for the ICA mixture, under certain weak assumptions we proved a fundamentally important result for the ICA model, namely the so-called one-bit-matching conjecture, which states that “all the sources can be separated as long as there is a one-to-one same-sign-correspondence between the kurtosis signs of all source probability density functions (pdf's) and the kurtosis signs of all model pdf's”; and (2) for the NFA mixture model, we focused on a key yet analytically intractable step of the algorithm: the factor estimation step.

Furthermore, for nonlinear shape dependence structure, we discuss multisets mixture learning (MML) based shape detection in images. MML provides a general method for shape detection in images by minimizing the mean square error (MSE) of reconstruction, and the shapes to be detected can be roughly classified into two categories. The first category includes shapes that can be formulated mathematically; the second includes those represented by a preset template. For the first category, which by nature needs a specific algorithm for each shape, we develop an MML-based algorithm for detecting ellipses in images. For the second category, whose template is represented by a set of contour points, we develop an efficient line segment approximation (LSA) algorithm to calculate the sample reconstruction error without enumerating the distances between the sample and all the contour points. (Abstract shortened by UMI.)
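The abstract does not give the details of the LSA algorithm, so the following is only a rough sketch of the underlying idea: measuring a sample's reconstruction error against the line segments of a template contour rather than against every contour point. The function names `point_segment_distance` and `template_error`, the closed-polygon template, and the use of squared Euclidean distance as the error are all assumptions made for illustration; they are not taken from the thesis.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment from a to b.

    Hypothetical helper, not taken from the thesis.
    """
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection of p onto the line through a, b,
    # clamped to [0, 1] so the closest point stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def template_error(p, vertices):
    """Squared distance from p to a closed polygonal template.

    Assumption: the template contour is approximated by the polygon whose
    corners are `vertices`, so only len(vertices) segments are examined
    instead of every contour point.
    """
    n = len(vertices)
    d = min(
        point_segment_distance(p, vertices[i], vertices[(i + 1) % n])
        for i in range(n)
    )
    return d * d
```

For a square template with corners (0, 0), (2, 0), (2, 2) and (0, 2), a sample at (1, 3) lies one unit above the top edge, so `template_error` evaluates four segments however densely the contour was originally sampled.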
Keywords/Search Tags:Dependence structure, ICA, Pattern recognition, Mining, Local PCA, NFA mixture, Feature extraction, MML