An Application Research Of LDA Model On Text Classification

Posted on:2017-12-27

Degree:Master

Type:Thesis

Country:China

Candidate:S Y Cong

Full Text:PDF

GTID:2428330548483812

Subject:Computer technology

Abstract/Summary:

The high dimensionality, sparseness, and ambiguity of text are the main factors that keep text classification from achieving excellent performance. Traditional feature selection algorithms, which are based on the assumption of conditional independence between terms, can address the first two problems, but they ignore semantic information. To take latent semantics into account, the LDA topic model is used for feature extraction. However, LDA does not handle the input space effectively: when it assigns a topic label to each word in the original space, it retains the non-action words, which strongly distorts the probability distribution of the topics. To overcome this shortcoming, this paper proposes a new method, LSI_LDA. LSI first maps the input space to a low-dimensional space and filters out the non-action words, so that LDA performs topic labeling in a simpler and clearer space. As a result, it obtains a more precise topic distribution and better modeling capability. The idea behind this pre-filter is as follows. Traditional feature selection algorithms based on conditional independence, as well as mRMR, extract a subset of the original features and do not change the interpretation of the original semantics. LSI, by contrast, uses singular value decomposition to map the original space into a lower-dimensional space and then generates new feature relationships in terms of the latent semantics. It seeks the features that best represent the documents.
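The two-stage idea described above (an LSI pre-filter followed by LDA topic labeling) can be sketched roughly as below. This is a minimal illustration and not the thesis's implementation: the abstract does not say how LSI decides which words to drop, so the scoring rule here (the norm of each term's loading in the top singular dimensions) is an assumption, as are the function names `lsi_term_filter` and `lda_gibbs`, the toy corpus, and all hyperparameter values. LDA is fitted with a plain collapsed Gibbs sampler, one common inference choice among several.

```python
import numpy as np


def lsi_term_filter(counts, n_components, keep):
    """Assumed pre-filter: keep the `keep` terms with the largest LSI loadings."""
    # LSI step: singular value decomposition of the document-term matrix.
    _, s, vt = np.linalg.svd(np.asarray(counts, dtype=float), full_matrices=False)
    k = min(n_components, len(s))
    # Score each term by the norm of its loading in the top-k latent dimensions,
    # weighted by the singular values (hypothetical scoring rule).
    scores = np.linalg.norm(s[:k, None] * vt[:k], axis=0)
    kept = np.argsort(scores)[::-1][:keep]
    return np.sort(kept)


def lda_gibbs(counts, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns the document-topic matrix."""
    counts = np.asarray(counts, dtype=int)
    rng = np.random.default_rng(seed)
    n_docs, n_terms = counts.shape
    # Expand the count matrix into one (document, term) pair per token.
    doc_idx, term_idx = np.nonzero(counts)
    reps = counts[doc_idx, term_idx]
    d = np.repeat(doc_idx, reps)
    w = np.repeat(term_idx, reps)
    # Random initial topic label for every token, plus the count tables.
    z = rng.integers(n_topics, size=len(w))
    ndk = np.zeros((n_docs, n_topics))
    nkw = np.zeros((n_topics, n_terms))
    nk = np.zeros(n_topics)
    np.add.at(ndk, (d, z), 1)
    np.add.at(nkw, (z, w), 1)
    np.add.at(nk, z, 1)
    for _ in range(n_iter):
        for i in range(len(w)):
            di, wi, zi = d[i], w[i], z[i]
            # Remove the token's current label from the counts.
            ndk[di, zi] -= 1
            nkw[zi, wi] -= 1
            nk[zi] -= 1
            # Conditional distribution over topics for this token.
            p = (ndk[di] + alpha) * (nkw[:, wi] + beta) / (nk + n_terms * beta)
            zi = rng.choice(n_topics, p=p / p.sum())
            # Record the newly sampled label.
            z[i] = zi
            ndk[di, zi] += 1
            nkw[zi, wi] += 1
            nk[zi] += 1
    smoothed = ndk + alpha
    return smoothed / smoothed.sum(axis=1, keepdims=True)


# Toy corpus (made up for illustration): rows are documents, columns are terms.
vocab = ["ball", "team", "goal", "vote", "party", "law", "the", "of"]
counts = np.array([
    [3, 2, 2, 0, 0, 0, 1, 1],
    [2, 3, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 3, 2, 2, 0, 1],
    [0, 0, 0, 2, 3, 1, 1, 1],
])

kept = lsi_term_filter(counts, n_components=2, keep=6)
theta = lda_gibbs(counts[:, kept], n_topics=2)
print([vocab[j] for j in kept])
print(theta.round(2))
```

Each row of `theta` is a document's topic distribution, which could then serve as the feature vector passed to a classifier. Filtering the columns before LDA, rather than feeding LDA the dense SVD projection, keeps the input as non-negative word counts, which LDA requires.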

Keywords/Search Tags:

text classification, feature extraction, mRMR, LSI, LDA

Related items

1	Research On Feature Selection Algorithm For Text
2	Research On Feature Extraction And Classification Algorithm In Text Categorization
3	Design And Implementation Of Text Classification Model Based On The Improved TF-IDF Feature Extraction
4	Chinese Text Feature Extraction And Classification Based On The Semantics Association
5	Research On Topic Feature Extraction And Text Classification In Social Internet Community
6	Research And Application Of Talent Job Online Matching Based On Text Feature Extraction Technology
7	A Research On Feature Extraction Applied For Text Classification
8	Research On Some Problems In Text Classification
9	Research On Text Macro Feature Extraction And Centroid-based Automatic Classification Methods
10	Studies On Key Techniques Of Text Classification And Mining For Specific Domains