Application Of Locally Linear Embedding In Text Classification

Posted on:2008-09-23

Degree:Master

Type:Thesis

Country:China

Candidate:C E Li

Full Text:PDF

GTID:2178360245478296

Subject:Computer application technology

Abstract/Summary:

Real-world data is often high-dimensional, which makes it difficult to understand, represent, and process. This raises two issues. The first is the curse of dimensionality, which challenges pattern recognition and rule discovery on high-dimensional data. The second is the blessing of dimensionality: the abundant information in a high-dimensional data set opens up new possibilities. How to represent high-dimensional data in a low-dimensional space and discover its intrinsic structure is the pivotal problem of high-dimensional information processing.

Text classification faces the same problem: there are thousands of features, often more than the number of documents. With so many dimensions, it is very difficult to estimate the statistical characteristics of the samples, which leads to overfitting and reduces classifier performance. Selecting features that represent the documents well is therefore necessary. Effective dimensionality reduction can make the learning task in text classification more efficient and more accurate.

In this paper, the procedure of the locally linear embedding (LLE) algorithm is studied and applied to text classification. Texts are represented as vectors using the vector space model (VSM). Feature reduction first yields a lower-dimensional data set, and LLE then reduces the dimensionality much further. A classifier is trained on the training samples and then used to classify the test samples; a classifier based on the support vector machine (SVM) is chosen. LLE does not require an iterative algorithm, only a few parameters need to be set, and it performs very well on high-dimensional face data sets. However, the algorithm is sensitive to two parameters that must be set manually, the number of neighbors and the embedding dimension, and this sensitivity has seldom been studied; in particular, obtaining a reliable estimator of the embedding dimension remains an open problem. This paper therefore compares the results for different numbers of neighbor points and different intrinsic dimensionalities to find the best setting.
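
The pipeline described in the abstract (VSM, feature reduction, LLE, SVM, then a comparison over the number of neighbors and the embedding dimension) might be sketched roughly as follows. This is only an illustration under several assumptions that do not come from the thesis: it uses scikit-learn, TF-IDF weighting for the VSM, a chi-square criterion for the first reduction stage, a linear SVM kernel, and a tiny made-up corpus. The thesis's actual data, feature reduction method, and settings are not stated in the abstract.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.svm import SVC

# Hypothetical toy corpus (not from the thesis): 0 = sports, 1 = finance.
train_docs = [
    "the team won the football match",
    "the striker scored a late goal in the match",
    "the coach praised the team after the game",
    "fans cheered as the team scored a goal",
    "the bank raised interest rates again",
    "stock markets fell as interest rates rose",
    "the central bank cut rates to help markets",
    "investors sold stock after the bank report",
]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]
test_docs = [
    "the team scored in the final game",
    "a late goal decided the football match",
    "the bank changed interest rates",
    "stock markets reacted to the report",
]
test_labels = [0, 0, 1, 1]

# Step 1: vector space model. TF-IDF weighting is an assumption.
vsm = TfidfVectorizer(stop_words="english")
X_train = vsm.fit_transform(train_docs)
X_test = vsm.transform(test_docs)

# Step 2: first-stage feature reduction. Chi-square is an assumed criterion.
n_keep = min(15, X_train.shape[1])
selector = SelectKBest(chi2, k=n_keep)
# LLE expects dense input, so the sparse matrices are densified here.
X_train_sel = selector.fit_transform(X_train, train_labels).toarray()
X_test_sel = selector.transform(X_test).toarray()


def evaluate(n_neighbors, n_components):
    """Embed with LLE, train an SVM on the embedding, return test accuracy."""
    lle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors,
        n_components=n_components,
        eigen_solver="dense",  # small data set, so a dense solver is enough
    )
    Z_train = lle.fit_transform(X_train_sel)
    Z_test = lle.transform(X_test_sel)  # out-of-sample mapping of test docs
    clf = SVC(kernel="linear").fit(Z_train, train_labels)
    return clf.score(Z_test, test_labels)


# Step 3: compare settings of the two LLE parameters named in the abstract.
results = {
    (n_nb, dim): evaluate(n_nb, dim)
    for n_nb in (3, 4, 5)
    for dim in (2, 3)
}
best = max(results, key=results.get)
print("best (n_neighbors, n_components):", best, "accuracy:", results[best])
```

On a corpus this small the accuracies say nothing about LLE itself; the point is only the shape of the comparison, where each pair of neighbor count and embedding dimension is evaluated with the same classifier and the best-scoring pair is kept.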

Keywords/Search Tags:

locally linear embedding, text classification, intrinsic dimensionality, vector space model

Related items

1	Manifold Learning And Applications Of Locally Linear Embedding Algorithm
2	Study Of Locally Linear Embedding To Outlier Detection In High Dimensional Space
3	Improvement Of Locally Linear Embedding Algorithm And Its Application In Face Recognition
4	Research On The Manifold Based Locally Dimensionality Reduction Algorithms
5	Locally Linear Embedding Based On Manifold Learning And The Applications In Face Recognition
6	Research On Dimensionality Reduction Algorithms Based On Locally Linear Embedding
7	Geometric Perspective On Linear Dimensionality Reduction Algorithms
8	Research On Dimensionality Reduction Algorithms Based Locally Linear Analysis
9	The Improvement And Research Of Locally Linear Embedding Algorithm
10	Study Of Data Reduction Technique Based On Manifold Learning