
Studying And Improving The Online Topic Evolution Model Based On LDA

Posted on: 2013-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Xia
Full Text: PDF
GTID: 2268330422474162
Subject: Control Science and Engineering
Abstract/Summary:
With the rapid growth of the Internet, online public opinion has come to influence public opinion at large. Monitoring Internet public opinion in real time and guiding it effectively has become an active research field.

This paper proposes an Online Topic Evolution Model based on LDA and Bi-Path Evolution (the BPE-OLDA model), intended mainly to mine information about hot topics and their evolution from time-limited data streams. Based on the characteristics of time-limited data streams, the paper defines genetic degrees of topic intensity to model the influence that historical data exert on the generative process of future data. BPE-OLDA is therefore a bi-path evolution model built on genetic degrees of both topic content and topic intensity.

To improve the accuracy of the topics mined by the BPE-OLDA model, the paper also proposes a Rectified Online Gibbs Resampling algorithm (Rect-OGRS) to infer the parameters of the BPE-OLDA model. Rect-OGRS contains two embedded algorithms:

1) The Rectified Online Gibbs Sampling algorithm (Rect-OGS). This algorithm rectifies the formula that the Online Gibbs Sampling algorithm uses to estimate the word distributions of topics.

2) The Gibbs Resampling algorithm (GRS). After noise topics are removed, this algorithm scans the texts of the current time slice and resamples the topics that generate the words in those texts.

The paper uses the variation in the number of online topics, algorithm similarity, and several newly proposed indices to measure how well the Rect-OGRS algorithm and the BPE-OLDA model perform in mining data. Two data sets are used in the experiments: political news from Taiwanese websites, and conference papers from NIPS.

Analysis of the experimental results leads to the following conclusions. The Rect-OGRS algorithm performs better than the Online Gibbs Sampling algorithm on both the political news set and the NIPS paper set. The BPE-OLDA model, however, clearly outperforms the ordinary Topic Evolution Model only on the time-limited political news set, and does not show the same advantage on the NIPS paper set.
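For readers unfamiliar with online LDA, the following is a minimal sketch of the general idea the abstract builds on: collapsed Gibbs sampling is run on one time slice, and the resulting topic-word counts are carried forward as a prior for the next slice. It is not the thesis's Rect-OGS or GRS algorithm, whose formulas the abstract does not give. The function names `gibbs_slice` and `inherited_prior`, and the `weight` parameter that loosely stands in for a genetic degree of topic content, are hypothetical and chosen only for illustration.

```python
import random


def gibbs_slice(docs, n_topics, vocab_size, beta_prior,
                alpha=0.1, n_iters=50, seed=0):
    """Collapsed Gibbs sampling for plain LDA on one time slice.

    docs:       list of documents, each a list of integer word ids
    beta_prior: n_topics x vocab_size matrix of positive topic-word priors
    Returns (topic-word count matrix, per-token topic assignments).
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]                # doc-topic counts
    n_kw = [[0] * vocab_size for _ in range(n_topics)]   # topic-word counts
    n_k = [0] * n_topics                                 # tokens per topic
    beta_sum = [sum(row) for row in beta_prior]

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta_prior[t][w])
                    / (n_k[t] + beta_sum[t])
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return n_kw, z


def inherited_prior(prev_n_kw, weight, base_beta=0.01):
    """Build the next slice's prior from the previous slice's counts.

    `weight` is a hypothetical stand-in for how strongly past topic
    content is inherited; the thesis's actual definition is not shown.
    """
    return [[base_beta + weight * c for c in row] for row in prev_n_kw]
```

A typical use would run `gibbs_slice` on the first slice with a flat prior, pass the returned counts through `inherited_prior`, and use the result as `beta_prior` for the next slice. A larger `weight` makes topics in the new slice stay closer to those of the previous one.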
Keywords/Search Tags: topic evolution model, genetic degrees of topic intensity, genetic degrees of topic content, noise topics, time-limited