Study On The LDA And Improvement Of Its Instability

Posted on: 2020-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: C J Wang
GTID: 2428330614465314
Subject: Mathematics
Abstract/Summary:
Topic models extract latent topics from text data and then cluster large document collections according to the topics to which each text belongs. A widely used topic model is Latent Dirichlet Allocation (LDA). However, LDA suffers from an “order effect”: shuffling the order of the training data produces different topics, and the same text may be assigned to different topics. This instability can lead to misleading results, specifically inaccurate topic descriptions and less effective text mining and classification. Some scholars have proposed an LDA model based on a genetic algorithm, which improves the stability of the model to some extent. However, the genetic-algorithm-based LDA model converges relatively slowly and easily falls into local optima; moreover, its results are less interpretable. In response to these shortcomings, this paper uses the differential evolution algorithm to optimize the relevant parameters of the LDA model; the optimized model is called the LDA-DE model. After establishing the LDA-DE model, this paper defines a measure of model stability, called topic stability, and then uses topic stability and text clustering accuracy as indicators to compare the results of the LDA model and the LDA-DE model. The results show that the LDA-DE model has higher topic stability and accuracy. Finally, this paper uses hot news about “315 Consumer Rights Day 2019” as the corpus to build both the LDA model and the LDA-DE model, and mines topics with the LDA-DE model, which has the higher topic stability. Comparing the topics extracted by the LDA-DE model with the content of the news in the corpus indicates that the model performs well.
Keywords/Search Tags: LDA, Topic model, Differential Evolution, Model stability
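
As a rough illustration of the optimization step described in the abstract, the sketch below shows a generic differential evolution loop (the common DE/rand/1/bin variant) searching over three LDA-style parameters. It is not the thesis's implementation: the abstract does not say which parameters are tuned or what fitness function is used, so the parameter names (number of topics, `alpha`, `beta`), the bounds, and `toy_objective` are all assumptions made here for illustration. In a real setting the objective would presumably train an LDA model with the candidate parameters and return a score to minimize.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Minimize `objective` with a plain DE/rand/1/bin loop (illustrative sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the search box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_objective(params):
    """Hypothetical stand-in for an LDA fitness function.

    A real objective would train LDA with these parameters and return,
    for example, a negated stability or clustering score. This smooth
    bowl with its minimum at (10, 0.5, 0.1) only exercises the optimizer.
    """
    num_topics, alpha, beta = params
    return ((num_topics - 10.0) / 10.0) ** 2 + (alpha - 0.5) ** 2 + (beta - 0.1) ** 2


# Assumed search ranges: number of topics, document-topic prior, topic-word prior.
BOUNDS = [(2.0, 50.0), (0.01, 1.0), (0.01, 1.0)]

best_params, best_score = differential_evolution(toy_objective, BOUNDS)
print(best_params, best_score)
```

The number of topics is treated as a continuous value here for simplicity; rounding it to an integer inside the objective would be one way to adapt the sketch to an actual LDA library.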