A Similarity Measure Between News Topics And Blog Topics

Posted on: 2014-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: P C Lv
Full Text: PDF
GTID: 2298330452463674
Subject: Computer Science and Technology
Abstract/Summary:
As new media continue to emerge, the question of how traditional media can take advantage of them for their own development has begun to receive attention. As a representative new media platform of the Web 2.0 era, the blog is of growing value to traditional media. News media use the hot topics discussed in blogs to gauge public concerns and trends related to the news. By finding the blog topics associated with given news topics, we can supplement objective news reports with subjective commentary. In this paper, using topic models and combining the structural and content features of blog posts, we study topic association between blogs and news.

First, we use a topic model to build a news topic model and a blog topic model. We make some improvements to the blog corpus and obtain the semantic information of the blog and news corpora. Then we use Euclidean distance, cosine similarity, Hellinger distance, Tanimoto similarity, and Jensen-Shannon divergence to calculate the similarity between topic models. After that, we propose a voting-based association detection method that uses the results of the five similarity measures above to find the best association between news topics and blog topics. Finally, we present the experimental results and evaluate and analyze each association method as well as the voting model.

The experiments show the performance of each association method on the news-blog topic association task. Cosine similarity and Tanimoto similarity prove to be well suited to this task, while the voting model proposed in this paper achieves the best precision together with high recall and F-score, which demonstrates the effectiveness of the method.
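The five measures and the voting step can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: each topic is represented as a probability distribution over one shared vocabulary, each measure casts a single vote for the blog topic it ranks closest, and the blog topic with the most votes wins. The function names, the `MEASURES` table, and `vote_best_match` are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

def _dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def euclidean(p, q):
    # Distance: smaller means more similar.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cosine(p, q):
    # Similarity: larger means more similar.
    norm = math.sqrt(_dot(p, p)) * math.sqrt(_dot(q, q))
    return _dot(p, q) / norm if norm else 0.0

def hellinger(p, q):
    # Distance between two probability distributions, in [0, 1].
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2)

def tanimoto(p, q):
    # Similarity: dot product over (|p|^2 + |q|^2 - dot product).
    dot = _dot(p, q)
    denom = _dot(p, p) + _dot(q, q) - dot
    return dot / denom if denom else 0.0

def js_divergence(p, q):
    # Jensen-Shannon divergence with base-2 logs, in [0, 1].
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)
    return (kl(p, m) + kl(q, m)) / 2

# Each measure is paired with the rule that selects its best score:
# min for distances and divergences, max for similarities.
MEASURES = {
    "euclidean": (euclidean, min),
    "cosine": (cosine, max),
    "hellinger": (hellinger, min),
    "tanimoto": (tanimoto, max),
    "jensen_shannon": (js_divergence, min),
}

def vote_best_match(news_topic, blog_topics):
    """Return (index of winning blog topic, number of votes it received).

    Assumed voting rule: one vote per measure, simple plurality.
    """
    votes = Counter()
    for measure, pick in MEASURES.values():
        scores = [measure(news_topic, b) for b in blog_topics]
        votes[scores.index(pick(scores))] += 1
    return votes.most_common(1)[0]

# Toy distributions over a three-word vocabulary (illustrative data only).
news = [0.7, 0.2, 0.1]
blogs = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1], [0.3, 0.4, 0.3]]
winner, n_votes = vote_best_match(news, blogs)
```

Pairing each measure with `min` or `max` keeps distances and similarities comparable at the voting stage without rescaling them. A weighted vote or a similarity threshold for rejecting weak matches would be natural variations, but the abstract does not say which rule the thesis uses.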
Keywords/Search Tags: Blog, News report, Similarity Measures, Topic Model