Characterizing Micro-blog Trend’s Aspects With Key-Anchored LDA Topic Model

Posted on: 2016-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: A Zhao
Full Text: PDF
GTID: 2308330461974077
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology, social networking services such as microblogs (weibo) have grown quickly and amassed an enormous number of users. Users can follow the people or topics that interest them and share their own status anytime and anywhere through the blogging service. Popular microblog platforms have become a primary source of news, and in academia microblogging has become a research hotspot in data mining. Topic analysis of microblogs is of great significance because it offers technical support for information navigation and query processing. For instance, it can supply users with currently popular information in real time. It can also help managers understand the concerns and preferences of different user groups, thereby supporting related decision-making. Furthermore, it gives government and public opinion workers convenient, data-backed access to public events and trends.

Building on previous work, we continue the study of topic detection in microblogs and propose the Key-Anchored LDA model for microblog topic analysis, with the goal of characterizing a trend's latent fine-grained aspects. We focus mainly on the following problems: extracting fine-grained dimensions from a coarse-grained trend, topic modeling of short text, generating emotional topics using emotion tags, and generating user dialogue topics using the mention tag '@' in posts.

The main content of this paper includes the following points:

(1) We propose a task that differs from traditional topic detection and latent semantic analysis tasks: given a fixed document set on a specific coarse-grained topic, how can the topic's fine-grained aspects be extracted? Specifically, we study a coarse-grained topic from the weibo stream in order to distill its fine-grained semantic dimensions.

(2) We propose the concept of the "Entity-Level Topic" for describing a fine-grained semantic dimension, so that the granularity of a topic is limited to the well-understood entity level.

(3) Based on the hypothesis above, we propose an improved topic model, Key-Anchored LDA, which combines named entities and nouns. This addresses the heavy manual intervention that classical topic models require when setting topic granularity, which is generally determined by the number of topics k.

(4) We propose a method that draws emotion tags for a post from a Bernoulli distribution and reuses these tags to generate emotion topics. This method can effectively model text that contains emotion and generate the corresponding emotional subtopic dimension. Furthermore, we use the '@' tag to generate a conversation topic dimension, which characterizes the conversation-level content of a hot topic.

Based on the above points, we used the Tencent Weibo API to crawl hot-topic posts for the proposed task and carried out experiments on the proposed model. The results show that the proposed method for analyzing a topic's fine-grained aspects is effective and feasible. For a specific topic, the method can extract subtopics that carry fine-grained aspect information. Compared with the results of the classic LDA topic model, it achieves better performance on indicators such as PMI and perplexity. In addition, the introduction of entity information makes the generated topics more readable and easier to understand.
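
The abstract names PMI and perplexity as evaluation indicators but does not give their formulas. Purely as an illustration, the Python sketch below shows one common way such scores are computed: average pairwise PMI over a topic's top words using document-level co-occurrence, and perplexity from per-token log-likelihoods. The function names `pmi_coherence` and `perplexity`, the smoothing constant `eps`, and the toy data are assumptions made here, not details taken from the thesis.

```python
import math
from itertools import combinations


def pmi_coherence(top_words, documents, eps=1e-12):
    """Average pairwise PMI of a topic's top words (illustrative sketch).

    Probabilities are estimated from document-level co-occurrence:
    p(w) is the fraction of documents containing w, and p(w1, w2) is the
    fraction containing both. `eps` is an assumed smoothing constant that
    avoids log(0) when two words never co-occur.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)
    if n_docs == 0:
        return 0.0

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1 = doc_freq(w1) / n_docs
        p2 = doc_freq(w2) / n_docs
        p12 = doc_freq(w1, w2) / n_docs
        if p1 == 0 or p2 == 0:
            continue  # skip words that never appear in the corpus
        scores.append(math.log((p12 + eps) / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0


def perplexity(token_log_probs):
    """Perplexity = exp(-mean log-likelihood per token); lower is better."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Toy corpus (hypothetical): each document is a list of tokens.
docs = [["a", "b"], ["a", "b"], ["c"], ["c"]]
coherent = pmi_coherence(["a", "b"], docs)    # words that always co-occur
incoherent = pmi_coherence(["a", "c"], docs)  # words that never co-occur
```

Under this convention a higher PMI score suggests that a topic's top words tend to appear together, while a lower perplexity suggests that the model assigns higher probability to held-out text. The exact variants used in the thesis may differ.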
Keywords/Search Tags: microblog, topic modeling, granularity, semantic dimension, Key-Anchored LDA