Contributions to statistical learning and statistical quantification in nanomaterials

Posted on: 2010-01-09
Degree: Ph.D
Type: Dissertation
University: Georgia Institute of Technology
Candidate: Deng, Xinwei
Full Text: PDF
GTID: 1448390002976277
Subject: Statistics
Abstract/Summary:
The research topic in chapter one is covariance matrix estimation for a large number of Gaussian random variables, a challenging yet increasingly common problem. A fact often neglected in practice is that the random variables are frequently observed with certain temporal or spatial structures. Such structures arise naturally in many practical situations, with time series and images as the most popular and important examples. Effectively accounting for them not only results in more accurate estimation but also leads to models that are more interpretable. In this chapter, we propose shrinkage estimators of the covariance matrix specifically to address this issue. The proposed methods exploit sparsity in the inverse covariance matrix in a systematic fashion so that the estimate conforms to models of Markov structure and is amenable to subsequent stochastic modeling. The present approach complements the existing work in this direction, which deals exclusively with temporal orders, and provides a more general and flexible alternative for exploring potential Markov properties. It is shown that the estimation procedure can be formulated as a semi-definite program and computed efficiently. The merits of these methods are illustrated through simulation and the analysis of a real data example.

Extending classical principal component analysis (PCA), kernel PCA (Schölkopf, Smola and Müller, 1998) effectively extracts nonlinear structures from high-dimensional data. Like PCA, kernel PCA can be sensitive to outliers. Various approaches have been proposed in the literature to robustify classical PCA. However, it is not immediately clear how these approaches can be "kernelized" in practice. In the second chapter, we propose a robust kernel PCA procedure. We show that the proposed method can be easily computed.
Simulations and a real example from the financial services industry also demonstrate the competitive performance of the proposed approach when there are outlying observations.

The third chapter deals with active learning via sequential design. Motivated by a problem in detecting money laundering accounts, we propose an active learning method using Bayesian sequential designs. The method uses a combination of stochastic approximation and D-optimal designs to judiciously select the accounts for investigation. The sequential nature of the method helps to identify the suspicious accounts with minimal time and effort. An application to real banking data is used to demonstrate the performance of the method. A simulation study shows the efficiency and accuracy of the proposed method, as well as its robustness to model assumptions.

Chapter four develops factor logit models for classification with a large number of categories. We study the theoretical properties of the estimated classifier functions. It is worth noting that when the number of categories is relatively large, the classifier functions are likely to lie in a functional subspace whose dimension is much smaller than the number of categories. Therefore, we propose a factor model for the classifier functions. We show that the convergence rate of the classifier functions estimated from the factor model does not depend on the number of categories, but only on the number of factors. The proposed method can therefore achieve better classification accuracy.

In chapter five, we present a statistical approach to quantifying the elastic deformation of nanomaterials. Quantifying the mechanical properties of nanomaterials is made difficult by their small size, the difficulty of manipulating them, the lack of reliable measurement techniques, and grossly varying measurement conditions and environments.
A recently proposed approach is to estimate the elastic modulus from a force-deflection physical model (a simply-supported beam model) based on the continuous bridged deformation of a nanobelt, measured with an Atomic Force Microscope tip under different contact forces. However, the nanobelt may have some initial bending, surface roughness and imperfect physical boundary conditions during measurement, leading to large systematic errors and uncertainty in data quantification. We propose a new statistical modeling technique, called sequential profile adjustment by regression (SPAR), to account for and eliminate the various experimental errors and artifacts. SPAR can automatically detect and remove the systematic errors and therefore gives a more precise estimate of the elastic modulus. This work presents an innovative approach that can potentially have a broad impact in quantitative nanomechanics and nanoelectronics. (Abstract shortened by UMI.)...
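
As a rough illustration of the force-deflection fit that chapter five starts from, the sketch below estimates an elastic modulus from a deflection profile using the textbook formula for a simply supported beam under a point load. It is not SPAR and makes no correction for initial bending, roughness or boundary effects. The beam dimensions, contact force, noise level and the function names `deflection` and `estimate_modulus` are all assumptions chosen for illustration; none of them come from the dissertation.

```python
import numpy as np


def deflection(force, a, length, modulus, inertia):
    """Deflection at the load point of a simply supported beam.

    Textbook result for a point load `force` applied at distance `a`
    from one support: delta = F * a^2 * b^2 / (3 * E * I * L), b = L - a.
    """
    b = length - a
    return force * a**2 * b**2 / (3.0 * modulus * inertia * length)


def estimate_modulus(force, a, length, inertia, measured):
    """Least-squares estimate of E from deflections measured along the beam.

    The model is linear in the compliance 1/E: measured = (1/E) * g(a),
    so we fit a slope through the origin and invert it.
    """
    b = length - a
    g = force * a**2 * b**2 / (3.0 * inertia * length)
    compliance = np.dot(g, measured) / np.dot(g, g)
    return 1.0 / compliance


# Hypothetical geometry and load (SI units), for illustration only.
length = 1.0e-6                       # suspended length, 1 micrometre
width, thickness = 100e-9, 50e-9      # rectangular cross-section
inertia = width * thickness**3 / 12.0
force = 100e-9                        # contact force, 100 nN
true_modulus = 100e9                  # assumed "true" E, 100 GPa

# Synthetic scan: load applied at 17 positions, 1% multiplicative noise.
rng = np.random.default_rng(0)
positions = np.linspace(0.1, 0.9, 17) * length
clean = deflection(force, positions, length, true_modulus, inertia)
noisy = clean * (1.0 + 0.01 * rng.standard_normal(positions.size))

estimate = estimate_modulus(force, positions, length, inertia, noisy)
print(f"estimated E = {estimate / 1e9:.1f} GPa")
```

Fitting the compliance 1/E rather than E keeps the problem linear, which is why a single dot-product ratio suffices here. Systematic effects such as initial bending would bias this plain fit, which is the problem the abstract says SPAR is designed to address.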
Keywords/Search Tags: Kernel PCA, Covariance matrix, Statistical, Chapter, Approach, Classifier functions, Large, Estimation