A Research Of Latent Dirichlet Allocation Model Based On Improved Variational Inference

Posted on: 2022-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: J Huang
Full Text: PDF
GTID: 2518306764968449
Subject: Automation Technology
Abstract/Summary:
A topic model is a statistical method for finding latent topics in text. Topic models, of which latent Dirichlet allocation (LDA) is the representative example, can automatically organize and summarize text, explain the latent semantics of documents, and analyze the topics contained in large volumes of information. Variational inference is a learning and inference method for latent-variable models, used mainly to approximate the posterior distribution in Bayesian models. LDA is itself a topic model based on Bayesian learning, so variational inference is well suited to computing the topic distribution of each document. This thesis takes the LDA model based on variational inference as its research topic and studies improvements to two components of variational inference: the variational distribution and the Kullback-Leibler (KL) divergence.

Firstly, the variational distribution is traditionally given a mean-field form to simplify the model, so that the variables of the multidimensional distribution are mutually independent. This is indeed simpler in the high-dimensional case. In practical applications, however, some variables are correlated, and the independence assumption degrades the accuracy of variational inference when the number of dimensions is small. This thesis therefore drops the independence assumption and modifies the model to capture the dependence between variables.

Secondly, for the evidence lower bound (ELBO), KL-divergence is traditionally used to measure the similarity between two distributions, and the expression for the lower bound is derived from it. Instead of relying on KL-divergence alone, this thesis obtains a new optimization objective by combining χ²-divergence with KL-divergence. With this measure, the algorithm has the advantages of both χ²-divergence and KL-divergence.

Finally, this thesis uses each of the two improved variational inference methods to estimate the parameters of the LDA model and obtain the topic distribution of the text. The improved methods are then compared with traditional variational inference applied to the LDA model. The computed model perplexity indicates that the improved models perform better than the traditional model.
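The quantities named in the abstract can be illustrated with a minimal Python sketch for discrete distributions. It is not the thesis's actual objective: the abstract does not state how the two divergences are combined, so the `weight` parameter, the convex-combination form, and the direction of each divergence in `combined_divergence` are assumptions made only for illustration. The `perplexity` function uses the standard held-out definition, the exponential of the negative log-likelihood per word, and all function names are hypothetical.

```python
import math


def kl_divergence(q, p):
    """KL(q || p) = sum_i q_i * log(q_i / p_i) for discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def chi2_divergence(p, q):
    """chi^2(p || q) = sum_i (p_i - q_i)^2 / q_i for discrete distributions."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def combined_divergence(p, q, weight=0.5):
    """Assumed form: convex mix of KL(q || p) and chi^2(p || q).

    The mixing rule and the weight are illustrative, not taken from the thesis.
    """
    return weight * kl_divergence(q, p) + (1.0 - weight) * chi2_divergence(p, q)


def perplexity(doc_log_likelihoods, doc_lengths):
    """exp(-(sum of document log-likelihoods) / (total word count)).

    Lower perplexity means the model predicts the words better.
    """
    return math.exp(-sum(doc_log_likelihoods) / sum(doc_lengths))


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]   # stand-in for a true posterior
    q = [0.5, 0.3, 0.2]   # stand-in for a variational approximation
    print("KL(q||p):   ", kl_divergence(q, p))
    print("chi2(p||q): ", chi2_divergence(p, q))
    print("combined:   ", combined_divergence(p, q, weight=0.5))
    # One 10-word document under a uniform model over a 4-word vocabulary.
    print("perplexity: ", perplexity([10 * math.log(1 / 4)], [10]))
```

Both divergences are zero when the two distributions coincide, so any non-negative weighting of them keeps that property. Comparing perplexity on the same held-out documents is what allows different inference methods to be ranked against each other.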
Keywords: Variational inference, LDA model, KL-divergence, χ²-divergence