Research On Real-time Topic Mining Framework And Algorithms Towards News Stream

Posted on: 2014-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: Q Qiao
GTID: 2268330395989275
Subject: Computer application technology
Abstract/Summary:
Since the arrival of mobile Internet access, the Internet, with its unmatched speed of propagation, has become an important channel for people to express appeals, emotions, and comments. Meanwhile, hot and sensitive topics closely tied to real-world events often originate on the Internet and then spread and diffuse, with a significant impact on public safety. Once a network topic becomes a focus of attention for Internet users, it generates a large number of related reports. Because such topics can be used for network opinion monitoring and public safety analysis, how to capture hot topics from network data streams efficiently and in real time has become an active research issue.

Based on the application requirements of news media monitoring, this paper constructs a framework for real-time topic mining and integrates into it a real-time topic mining algorithm and a topic trend discovery algorithm. The work addresses the following issues: 1) in keeping with the characteristics of news, it describes the topic model from multiple angles, i.e., time, place, person, institution, etc., which makes the topic model richer; 2) by adding a sliding window to real-time clustering, it filters out historical data and eliminates that data's influence on real-time topics; 3) it improves the scalability of the topic trend mining algorithm through parallel clustering; 4) through the implementation of the topic mining framework, it remains compatible with more optimized algorithms in the future, which improves the scalability of the platform.

In addition, to better display the topic modeling results, this paper sets up a topic presentation website to visualize them: for real-time topics, it extracts the most relevant tags and associates each topic with related videos through text similarity; for topic trends, it computes statistics over different time horizons to show intuitively how a topic develops.

Experimental results on a Chinese news data set show that the proposed real-time topic mining algorithm achieves considerable accuracy, and that, for topic trend mining, parallelization yields a significant speedup.
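
The sliding-window idea in item 2 can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the abstract does not specify. It assumes a single-pass clustering scheme with cosine similarity over term counts, and the names `SlidingWindowClusterer`, `window`, and `threshold` are hypothetical. Documents are assumed to arrive in non-decreasing timestamp order.

```python
from collections import Counter, deque
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SlidingWindowClusterer:
    """Hypothetical single-pass clusterer that forgets old documents."""

    def __init__(self, window, threshold):
        self.window = window        # window length, same unit as timestamps
        self.threshold = threshold  # minimum similarity to join a topic
        self.docs = deque()         # (timestamp, topic_id, vector), oldest first
        self.topics = {}            # topic_id -> summed term counts
        self.next_id = 0

    def _expire(self, now):
        # Remove documents that fell out of the window and subtract their
        # term counts, so they no longer influence the current topics.
        while self.docs and self.docs[0][0] <= now - self.window:
            _, topic_id, vec = self.docs.popleft()
            centroid = self.topics[topic_id]
            centroid.subtract(vec)
            for term in [t for t, c in centroid.items() if c <= 0]:
                del centroid[term]
            if not centroid:
                del self.topics[topic_id]

    def add(self, timestamp, tokens):
        """Assign a tokenized document to a topic and return the topic id."""
        self._expire(timestamp)
        vec = Counter(tokens)
        best_id, best_sim = None, 0.0
        for topic_id, centroid in self.topics.items():
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_id, best_sim = topic_id, sim
        if best_id is None or best_sim < self.threshold:
            # No sufficiently similar live topic: start a new one.
            best_id = self.next_id
            self.next_id += 1
            self.topics[best_id] = Counter()
        self.topics[best_id].update(vec)
        self.docs.append((timestamp, best_id, vec))
        return best_id


clusterer = SlidingWindowClusterer(window=10, threshold=0.3)
first = clusterer.add(0, ["flood", "city", "rescue"])
second = clusterer.add(1, ["flood", "rescue", "team"])
other = clusterer.add(2, ["election", "vote"])
late = clusterer.add(20, ["flood", "rescue"])
```

In this sketch the two early flood reports share a topic, while the report at time 20 starts a new one, because every earlier document has already left the window and no longer affects the live topics.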
Keywords/Search Tags:network topic, public safety, topic mining, topic trend, parallel clustering