
Topic Comparative Study Between Microblog And Traditional Media Based On LDA

Posted on: 2014-02-12; Degree: Master; Type: Thesis
Country: China; Candidate: Z Y Zhou; Full Text: PDF
GTID: 2248330392960919; Subject: Computer Science and Technology
Abstract/Summary:
As science and technology advance and the Internet spreads around the world, people increasingly get their news from websites, forums, and weblogs rather than from traditional media such as newspapers, magazines, television, and radio. With the arrival of Web 2.0 in particular, immediate, social media such as microblogs have grown rapidly, and getting news quickly from microblogs has become a trend. Our task is to identify the discrepancies between traditional media and microblogs: the common and medium-specific topics of events, the differences in how the content and attention factor of the same topics evolve, and the differences in how the same topics are expressed. We propose a comparative study of microblogs and traditional media based on Latent Dirichlet Allocation (LDA), which integrates the text features of both media with the help of a statistical probability model.

In this thesis, the LDA model is first used to model the corpora of both media about specific events, discretized by time, and to extract the semantic information of topics. We then calculate the Attention Factor (AF) to measure how much attention each topic receives on each medium, which topics get a higher AF, and which topics are specific to each medium. Next, we use the Jensen-Shannon divergence to calculate the Evolution Factor (EF), obtain the evolution path of a topic, and analyze the trends in both content and AF of the topics along that path. Finally, we calculate the Diversity Factor (DF) between the media to measure the discrepancy in the words they use, with a method based on common words to decide whether topics on the two media are the same.

The experiments show that critical topics get a higher AF (about 0.18) and persist on microblogs, but not in news reports (about 0.13). Their EF values are lower on microblogs (about 0.7) and higher in news reports (about 0.78), and they differ greatly between the two media (DF about 0.6). Factual topics get a higher AF (about 0.2) and persist in news reports, but not on microblogs (about 0.15). They behave similarly on both media, lasting long and varying widely (EF about 0.75), but differ only slightly between the two media (DF about 0.45).
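
The abstract names the Jensen-Shannon divergence as the basis of the Evolution Factor but does not give the exact EF formula. As a minimal sketch of the divergence itself, assuming each topic is represented as a word-probability list over a shared vocabulary, the standard definition can be written as below. The function names and the example distributions are hypothetical and are not taken from the thesis.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    With base-2 logarithms the result lies in [0, 1]:
    0 for identical distributions, 1 for distributions with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same vocabulary")
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical word distributions of one topic in two adjacent time slices.
topic_slice_1 = [0.5, 0.3, 0.2]
topic_slice_2 = [0.4, 0.4, 0.2]
print(js_divergence(topic_slice_1, topic_slice_2))
```

The divergence is symmetric and bounded, which makes it convenient for comparing a topic between consecutive time slices. How the thesis turns it into the reported EF values is not stated in the abstract.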
Keywords/Search Tags: Topic Model, Microblog, News Reports, Compare