
Manifold Learning On Probabilistic Graphical Models

Posted on: 2011-11-01    Degree: Master    Type: Thesis
Country: China    Candidate: Y L Shao    Full Text: PDF
GTID: 2178360302974674    Subject: Computer application technology
Abstract/Summary:
Modeling training data is a fundamental problem in machine learning. In this thesis, we bring together two of the most powerful data modeling techniques, manifold learning and statistical modeling, so that the combined method benefits from the advantages of both. Building on our previous work, the thesis proposes a theoretical framework grounded in Riemannian geometry. We design two different algorithms for optimization and inference, each with different performance guarantees. Beyond that, we give an in-depth analysis of the algorithms, covering convergence, convexity, and computational complexity, which greatly broadens the applicability of our approach. To accelerate the whole process, we also built a general-purpose statistical inference engine named YASIE (Yet Another Statistical Inference Engine), in which different models can be composed from building blocks and the computational cost is carefully tuned to approach that of hand-written inference code. Using these methodologies and tools, we report experimental results on two classic machine learning problems. Our approach is particularly effective for semi-supervised learning, where most of the training instances are unlabeled. Related papers have been published in top conferences and journals such as ACM Multimedia and IEEE TKDE.

Manifold learning assumes that the intrinsic dimensionality of data is much smaller than its apparent dimensionality: the possible data samples lie on a low-dimensional manifold embedded in the high-dimensional ambient space. The task of manifold learning is to recover the structure of the manifold from the finite set of samples at hand, so that we can approximate its true geometric properties, such as a low-dimensional embedding, tangent spaces, and the Laplace-Beltrami operator.
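As a rough illustration (not the thesis's own implementation), the discrete proxy for these geometric quantities is typically a graph Laplacian built from a k-nearest-neighbour similarity graph over the samples. The function name and parameters below are our own for this sketch:

```python
import numpy as np

def knn_graph_laplacian(X, k=5, sigma=1.0):
    """Build a k-NN Gaussian similarity graph over samples X (n x d)
    and return the weight matrix W and the unnormalized graph
    Laplacian L = D - W, a discrete analogue of the
    Laplace-Beltrami operator on the data manifold."""
    n = X.shape[0]
    # pairwise squared Euclidean distances
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbours of point i (index 0 is the point itself)
        nbrs = np.argsort(d2[i])[1:k + 1]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)            # symmetrize the graph
    L = np.diag(W.sum(axis=1)) - W    # unnormalized Laplacian
    return W, L
```

The Laplacian constructed this way is symmetric and positive semi-definite, with each row summing to zero; these are the properties that later regularization terms rely on.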
Current manifold learning methods usually build adjacency or similarity graphs over the data points, from which an objective function for optimizing the labels of the data points is induced. The strength of manifold learning is that it is highly non-parametric: it models non-trivial structure in the data effectively and accurately, and in some cases the discretely computed results can even be proved to converge to the continuous case. Its weakness is that it has difficulty with multi-modal data, such as that found in image annotation; it is also hard to incorporate prior knowledge or handle dynamic data in an online setting. Statistical methods, by contrast, model data with a properly factorized joint distribution. Thanks to their long history of development, statistical methods offer good solutions to the problems above. Their weakness, however, is that they are usually highly parametric: how well they fit the data depends on how well the parametric model is specified, and most computationally tractable and effective models have difficulty with data lying on non-trivial manifolds.

In this thesis, we combine the two approaches in two different ways. The first is to augment the objective function of statistical learning with a constraint induced by the adjacency graph, approximating the manifold structure; the added constraint acts as a regularization term. Most of the mature methodology in this thesis is based on this principle. The second direction is to make fuller use of probabilistic graphical models, modeling the adjacency graph directly, so that the probabilistic dependency relations can be expressed straightforwardly in a chain graph model.
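The first combination can be sketched in closed form. In the standard manifold regularization setup, a Laplacian penalty f^T L f is added to a regularized least-squares fit on the labeled points, which propagates labels smoothly along the graph. This is a generic sketch of that idea, not the thesis's specific algorithm; the function name and hyperparameters are our own:

```python
import numpy as np

def laplacian_rls(L, y, labeled, gamma_a=1e-2, gamma_i=1.0):
    """Semi-supervised fit via manifold regularization: minimize
        sum_{i labeled} (f_i - y_i)^2  +  gamma_a * ||f||^2
                                       +  gamma_i * f^T L f,
    where the last term penalizes label differences between
    graph-adjacent points.  L: graph Laplacian (n x n),
    y: targets (n,), labeled: boolean mask of labeled points.
    The quadratic objective has the closed-form solution below."""
    n = L.shape[0]
    J = np.diag(labeled.astype(float))          # selects labeled fit terms
    A = J + gamma_a * np.eye(n) + gamma_i * L   # SPD system matrix
    return np.linalg.solve(A, J @ y)
```

With only one labeled point per cluster, the Laplacian term spreads that label across the cluster, which is exactly the semi-supervised behaviour the abstract describes.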
We can prove that: (i) some manifold regularization terms can be reinterpreted as a specific form of chain graph representation; and (ii) some chain graph representations of the adjacency graph can be reinterpreted as manifold regularization. This part of the work is still exploratory.
Keywords/Search Tags:Manifold Learning, Statistical Learning, Manifold Regularization, Probabilistic Graphical Models, Semi-Supervised Learning