
Research and Application of Unsupervised Feature Reduction

Posted on: 2009-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhang
Full Text: PDF
GTID: 2178360272485710
Subject: Computer application technology
Abstract/Summary:
In many areas of machine learning, pattern recognition, information retrieval, and bioinformatics, one is often confronted with massive high-dimensional datasets, which lead to the curse of dimensionality. Learning machines are computationally expensive in high-dimensional feature spaces, and noisy features degrade the performance of learning algorithms. To address these problems, feature reduction maps the original feature space into a low-dimensional space in which the information important for subsequent learning tasks is preserved. Feature reduction can be broadly divided into two categories: feature extraction and feature selection. Feature extraction seeks a linear or nonlinear combination of the original features and decorrelates the dependencies among them. Feature selection identifies and suppresses features that are not discriminative of the true classes. In the unsupervised setting, the absence of class information makes feature reduction, and especially feature selection, a real challenge.

Manifold learning is an important branch of feature extraction. In this dissertation, we propose a novel manifold learning method called Locally Linear Inlaying (LLI). The basic assumption of manifold learning algorithms is that the input data lie on or close to a low-dimensional nonlinear manifold. Adopting a divide-and-conquer strategy, LLI first embeds local linear patches and then inlays them globally. LLI greatly improves the time complexity and robustness of manifold learning: first, its time complexity is linear in the number of data points; second, it overcomes problems caused by non-uniform sample distributions; third, it is robust to both homogeneous and heterogeneous noise. We demonstrate the efficiency and effectiveness of LLI on synthetic and real-world face datasets.

As for feature selection, the original feature set contains many noisy features, which seriously distort any reasonable distance metric (or relevance metric). Most existing feature selection methods lack metric invariance and are therefore susceptible to strongly irrelevant distance metrics. In this dissertation, we propose a metric-invariant approach to dealing with irrelevant distance metrics. The key observation is that if the statistic guiding unsupervised feature selection is invariant under metric scaling, then the solution of the feature selection model is also invariant; hence, if the model works on a relevant feature space, it will still work on any irrelevant feature space obtained from the relevant one by metric scaling. We give a theoretical justification of the invariance of our model, and experiments on synthetic and real-world text datasets are also promising.
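The divide-and-conquer structure described for LLI (embed local linear patches, then align them globally) can be illustrated with a minimal Python sketch. This is not the dissertation's LLI algorithm: the partition into patches here uses k-means, the local embedding uses PCA as a stand-in for the local linear step, the global inlaying (alignment) stage is omitted entirely, and the function name `local_linear_patches` is ours.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def local_linear_patches(X, n_patches=8, d=2, seed=0):
    """Sketch of the divide step only: partition the data into local
    patches and embed each patch linearly. LLI's global inlaying
    (stitching the patch coordinates together) is not reproduced."""
    labels = KMeans(n_clusters=n_patches, random_state=seed, n_init=10).fit_predict(X)
    embeddings = np.zeros((X.shape[0], d))
    for k in range(n_patches):
        idx = np.where(labels == k)[0]
        # PCA as a stand-in for the local linear embedding of one patch.
        embeddings[idx] = PCA(n_components=d).fit_transform(X[idx])
    return labels, embeddings

# Toy usage on a Swiss-roll-like dataset.
rng = np.random.default_rng(0)
t = 3 * np.pi * (1 + 2 * rng.random(1000))
X = np.column_stack([t * np.cos(t), 20 * rng.random(1000), t * np.sin(t)])
labels, Y = local_linear_patches(X)
print(Y.shape)  # (1000, 2): per-patch 2-D coordinates, not yet aligned
```

With a fixed number of patches, the per-patch work above grows linearly with the number of points, which is consistent with (though not a proof of) the linear time complexity claimed for LLI.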
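The metric-invariance argument can likewise be illustrated. The abstract does not specify the statistic used in the dissertation, so the sketch below uses standardized kurtosis as a hypothetical stand-in for a scale-invariant feature score: standardization cancels any per-feature metric scaling, so the induced feature ranking cannot be disturbed by an irrelevant metric, whereas a variance-based score is not invariant.

```python
import numpy as np

def variance_score(X):
    # NOT metric-invariant: rescaling a feature rescales its score.
    return X.var(axis=0)

def kurtosis_score(X):
    # Standardized fourth moment: invariant under per-feature scaling,
    # since (cX - c*mean) / (c*std) == (X - mean) / std for c > 0.
    # A stand-in for the (unspecified) statistic in the dissertation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return (Z ** 4).mean(axis=0)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6)) * rng.uniform(0.5, 2.0, 6)
scales = rng.uniform(0.1, 10.0, 6)  # an arbitrary per-feature metric scaling
Xs = X * scales

print(np.argsort(variance_score(X)), np.argsort(variance_score(Xs)))  # rankings typically differ
print(np.argsort(kurtosis_score(X)), np.argsort(kurtosis_score(Xs)))  # rankings agree
```

The point of the example is the comparison, not the particular score: any statistic that is invariant under metric scaling yields a feature selection whose solution is unchanged when an irrelevant metric is applied, which is exactly the property argued for in the abstract.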
Keywords/Search Tags: Unsupervised Learning, Feature Reduction, Feature Extraction, Feature Selection, Manifold Learning, Locally Linear Inlaying, Metric Invariance