Research On Web Text Categorization Based On Latent Semantic Analysis

Posted on:2008-07-04

Degree:Master

Type:Thesis

Country:China

Candidate:J F Wang

GTID:2178360212980810

Subject:Communication and Information System

Abstract/Summary:

This thesis proposes a method for Web text categorization based on latent semantic analysis (LSA). The method assumes that contextual relations exist among terms and between terms and documents, and that the relations between many documents and terms together form a semantic structure. This structure is computed and then reduced so that the most important relations between documents and terms are kept, while the large body of redundant, minor factors is eliminated. The optimized structure is smaller than the original yet preserves the principal relations, which makes it easier to handle the high dimensionality of text documents represented in the Vector Space Model (VSM) and allows latent semantic relations to be mined. In subsequent retrieval, the latent similarity between documents is computed, which improves the performance of Web text categorization.
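The reduction described in the abstract can be illustrated with a truncated Singular Value Decomposition of a term-by-document matrix. The following is only a minimal sketch under stated assumptions, not the thesis's implementation: the function names `lsa_project` and `cosine`, the toy matrix, and the choice of k = 2 are all hypothetical and chosen for illustration.

```python
import numpy as np

def lsa_project(term_doc, k):
    """Map each document (column) to a k-dimensional latent vector.

    Hypothetical helper: keeps only the k largest singular values of the
    term-by-document matrix and discards the remaining, minor factors.
    """
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Rows of the result are documents expressed in the latent space.
    return (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy VSM matrix (assumed data): rows are terms, columns are documents.
A = np.array([
    [1, 0, 1, 0],  # "car"
    [0, 1, 1, 0],  # "automobile"
    [0, 0, 1, 0],  # "engine"
    [0, 0, 0, 1],  # "flower"
    [0, 0, 0, 1],  # "petal"
], dtype=float)

docs = lsa_project(A, k=2)

raw_sim = cosine(A[:, 0], A[:, 1])       # similarity in the original term space
latent_sim = cosine(docs[0], docs[1])    # similarity in the latent space
unrelated_sim = cosine(docs[0], docs[3]) # document 0 vs. the "flower" document

print(raw_sim, latent_sim, unrelated_sim)
```

In this toy data, documents 0 and 1 share no terms, so their similarity in the original term space is zero. Both co-occur with document 2, however, so after the reduction they lie close together in the latent space, while the unrelated document 3 remains far from both. A classifier built on such latent vectors is one plausible way to use the structure for categorization.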

Keywords/Search Tags:

Latent semantic analysis, Web text categorization, Vector Space Model, Singular Value Decomposition, Local feature space

Related items

1	The Implementation And Research Of The Probabilistic Latent Semantic Analysis Model In The Search Engine's Business Text Classification System
2	Research On Text Classification Based On Ontology And Latent Semantic Indexing Algorithm
3	Research On Support Vector Machines Classification Algorithm In Text Categorization
4	Research Of Text Categorization Based On Vector Space Model
5	Research And Application Of Automatic Classification Method For Work Tickets
6	Research Of Text Categorization Base On Vector Space Model And Association Rules
7	Latent Semantic Analysis Based On Multi-system Combination
8	Research On Feature Selection Method Based On Text Category Relevance Degree And Latent Semantic Analysis
9	Chinese Text Clustering Based On Latent Semantic And Its Applications
10	The Research And Implementation Of Chinese Text Categorization