
Document Clustering Method Based On LDA Topic Model

Posted on: 2018-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: S S Chen
Full Text: PDF
GTID: 2348330542965320
Subject: Applied statistics
Abstract/Summary:
Text clustering is an important technique for text mining and for organizing and navigating information, and it has attracted growing attention from researchers. The traditional vector space model (VSM) cannot capture the semantic links between texts, and it suffers from high sparsity in text clustering. To address these problems, this paper proposes a text clustering method based on the LDA (Latent Dirichlet Allocation) topic model. The LDA topic model is a statistical model for mining the latent semantics of text and discovering the underlying topics in documents. Gibbs sampling is used for parameter inference, and each text is represented as a probability distribution over a fixed set of topics, which serves as its feature vector in the topic space. Experiments show that the clustering method based on the LDA topic model reduces the dimensionality of the text representation effectively, uncovers latent relationships in the semantic information, and, by incorporating semantic information into the text representation, makes the clustering results more practical.
Keywords/Search Tags: LDA model, text clustering, Gibbs sampling
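
The pipeline the abstract describes (infer topics with Gibbs sampling, represent each document by its topic distribution, then cluster in topic space) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the toy corpus, the hyperparameters `alpha` and `beta`, the function names `lda_gibbs` and `kmeans`, and the use of k-means as the clustering step are all hypothetical, since the abstract does not specify them.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA.

    docs is a list of token lists. Returns theta, one topic
    distribution (a list of num_topics probabilities) per document.
    alpha and beta are assumed symmetric Dirichlet hyperparameters.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token
                # (unnormalized; rng.choices accepts relative weights).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed document-topic distributions: the topic-space feature vectors.
    return [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means with squared Euclidean distance; returns a label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus: two finance-like and two biology-like documents.
docs = [
    "stock market price trade".split(),
    "market trade stock bank".split(),
    "gene protein cell dna".split(),
    "cell dna gene protein".split(),
]
theta = lda_gibbs(docs, num_topics=2)  # each row has 2 entries, not one per vocabulary word
labels = kmeans(theta, k=2)
```

Each row of `theta` has one entry per topic, so the feature vectors are much shorter and denser than VSM term vectors over the full vocabulary. Because the sampler is stochastic, the cluster labels on a corpus this small can vary with the seed.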