Hot Topic Detection on BBS Using Aging Theory

Posted on: 2011-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: D H Zheng
Full Text: PDF
GTID: 2178360308452435
Subject: Computer application technology
Abstract/Summary:
Bulletin board systems (BBS) are among the most common venues for threaded discussion, and they are increasingly popular among web users, especially in China. Every day a huge number of new discussions are generated on BBS, which makes hot topics difficult to find. To address this problem, we made an in-depth study of post publishing, user activity, and topic discussion on BBS, and we propose a novel approach to hot topic detection on BBS using aging theory.

First, the language on BBS is colloquial and non-standard, and posts are filled with abbreviations that even experienced BBS users find difficult to understand. We extract characteristics of posts to filter out invalid ones, which guarantees the quality of the extracted topic content. Second, BBS has a distinctive way of discussing topics. Our study found that although many posts are produced every day, most of them concentrate on a small number of topics. For instance, when a social emergency arises, users may discuss different aspects of it under the same theme; when a controversial topic emerges, users may discuss the same topic across different themes. In this paper, incremental clustering is employed to organize topic content effectively. Finally, topics migrate over the course of a BBS discussion. Constructing the theme vector based on post position offers a good solution to this problem.

To identify hot topics, we give a clear definition of a hot topic on BBS with four characteristics: a large number of posts, high-quality post content, high cohesion, and obvious burstiness. With this goal, we propose our hot topic detection method on BBS using aging theory. First, we pre-process the BBS data and obtain topic candidates through incremental single-pass clustering. Second, we use aging theory to calculate an energy value for each topic candidate. Both steps are carried out incrementally. Finally, according to the definition of a hot topic, we adjust the set of topics involved and rank the topics by energy value in descending order, which yields the hot topics on BBS.

Experiments on real BBS data show that our method is quite effective. First, it finds hot topics more comprehensively. Second, thanks to the in-depth study of BBS features, it finds hot topics that the traditional method cannot. Third, ranking topics by energy value makes the detected hot topics more timely. Finally, the method finds hot topics that last for a relatively short period as well as those that last for a long period.
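
The abstract gives no formulas, so the following Python sketch only illustrates how the two incremental steps (single-pass clustering, then an aging-style energy update and ranking) could fit together. It assumes whitespace tokenization, cosine similarity on raw term counts, a similarity threshold of 0.3, and a simplified linear energy update with constant decay. The names `Topic`, `add_post`, `end_time_slot`, and `rank_hot_topics` and all parameter values are hypothetical and not taken from the thesis. Post filtering, position-based theme vectors, and the final topic-set adjustment are left out.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Topic:
    """A topic candidate: a cluster of posts plus its current energy."""

    def __init__(self):
        self.centroid = Counter()  # summed term counts of member posts
        self.posts = []
        self.energy = 0.0
        self.fed = 0.0  # similarity mass received in the current time slot


def add_post(topics, text, threshold=0.3):
    """Single-pass step: join the most similar topic, or start a new one."""
    vec = Counter(text.lower().split())
    best, best_sim = None, 0.0
    for topic in topics:
        sim = cosine(vec, topic.centroid)
        if sim > best_sim:
            best, best_sim = topic, sim
    if best is None or best_sim < threshold:
        # Assumption: the first post of a new topic contributes full nutrition.
        best, best_sim = Topic(), 1.0
        topics.append(best)
    best.centroid.update(vec)
    best.posts.append(text)
    best.fed += best_sim
    return best


def end_time_slot(topics, gain=1.0, decay=0.5):
    """Aging step (assumed form): add nutrition, subtract a constant decay."""
    for topic in topics:
        topic.energy = max(0.0, topic.energy + gain * topic.fed - decay)
        topic.fed = 0.0


def rank_hot_topics(topics):
    """Rank topic candidates by energy, highest first."""
    return sorted(topics, key=lambda t: t.energy, reverse=True)
```

In this sketch a topic that keeps receiving similar posts gains energy in every time slot, while a topic that receives none loses `decay` per slot until its energy reaches zero. That is one simple way to favor both bursty and long-lived discussions when ranking.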
Keywords/Search Tags: Hot topic detection, Aging theory, BBS