Research on Text Classification Based on Latent Semantic Indexing for Science and Technology Information Retrieval

Posted on: 2010-05-17 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Peng | Full Text: PDF
GTID: 2208360278470229 | Subject: Computer application technology
Abstract/Summary:
As technology develops, innovation in science and technology is being taken more seriously. At present, however, there is no reference index of innovation, which makes it necessary to classify existing science and technology projects by their innovativeness; doing so can help improve the quality of project appraisal. The traditional method based on the vector space model cannot meet the required recall and precision, so this paper presents a document classification model based on Latent Semantic Indexing (LSI) to improve both. Starting from LSI with Singular Value Decomposition (SVD), the paper analyses how SVD behaves in practical classification and decides to replace it with Partial Least Squares. Experiments with the Overall Classification of Latent Semantic Indexing Model indicate that the improved model is better overall in classification stability and accuracy. However, the Overall Classification of Latent Semantic Indexing Model performs worse on rare types of information, so the Local Classification of Latent Semantic Indexing Model is proposed. To reduce memory cost, this model uses the Semi-Discrete Decomposition Method rather than Singular Value Decomposition. Experiments show that the Local Classification of Latent Semantic Indexing Model improves classification of rare types of information, and also improves both the amount of information handled and the correctness of its classification.
Both a standard Chinese corpus and science and technology project data are used as experimental corpora, which broadens the applicability of the classification model. Taking Latent Semantic Indexing as its foundation and comparing the Overall Classification of Latent Semantic Indexing Model with the Local Classification of Latent Semantic Indexing Model, this paper finds that the Local Classification of Latent Semantic Indexing Model can improve the performance of text classification to some extent.
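As a rough illustration of the SVD-based starting point described in the abstract, the following is a minimal sketch of LSI classification. It is not the thesis's implementation: the toy vocabulary, the six documents, the labels, the rank k = 2, and the nearest-centroid classifier with cosine similarity are all assumptions made for this example, and the sketch covers neither Partial Least Squares nor the Semi-Discrete Decomposition. It uses NumPy's `np.linalg.svd`.

```python
import numpy as np

# Hypothetical toy vocabulary and corpus (not taken from the thesis).
TERMS = ["matrix", "vector", "svd", "cell", "protein", "gene"]
DOCS = [
    (["matrix", "vector"], "math"),
    (["vector", "svd"], "math"),
    (["matrix", "svd"], "math"),
    (["cell", "protein"], "bio"),
    (["protein", "gene"], "bio"),
    (["cell", "gene"], "bio"),
]

def to_vector(words):
    """Binary term vector over TERMS for a bag of words."""
    return np.array([1.0 if t in words else 0.0 for t in TERMS])

def cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom > 0 else 0.0

# Term-document matrix: one row per term, one column per document.
A = np.column_stack([to_vector(words) for words, _ in DOCS])

# Truncated SVD: keep the k largest singular directions of term space.
k = 2
U, S, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]

def project(vec):
    """Map a term-space vector to k-dimensional latent coordinates."""
    return U_k.T @ vec

# One centroid per class in latent space (an assumed, simple classifier).
centroids = {}
for label in {lab for _, lab in DOCS}:
    members = [project(to_vector(w)) for w, lab in DOCS if lab == label]
    centroids[label] = np.mean(members, axis=0)

def classify(words):
    """Return the label whose latent centroid is most similar to the query."""
    q = project(to_vector(words))
    return max(centroids, key=lambda lab: cosine(centroids[lab], q))

query = ["svd"]
doc0 = DOCS[0][0]  # ["matrix", "vector"], shares no term with the query
raw_sim = cosine(to_vector(query), to_vector(doc0))
latent_sim = cosine(project(to_vector(query)), project(to_vector(doc0)))
predicted = classify(query)
```

In this toy corpus the query and the first document share no words, so their cosine similarity in the plain vector space model is zero, while in the two-dimensional latent space they point in the same direction and the query is assigned to the "math" class. This is the kind of recall gain over term matching that motivates LSI, shown here only on contrived data.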
Keywords/Search Tags: Information Retrieval, Text Classification, Latent Semantic Indexing, Singular Value Decomposition, Partial Least Squares, Semi-Discrete Decomposition Method