
Research On Hot Topic Detection And Automatic Summarization For Chinese Microblog

Posted on: 2013-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: X X Fei
GTID: 2298330467474652
Subject: Computer software and theory
Abstract/Summary:
Microblogging has become one of the most popular communication tools and platforms on the Internet. Millions of users share and express their feelings and opinions on microblogs every day. As a new platform for fast sharing and broadcasting, it carries a large volume of information. It has become the first source for many important social events, because its users are often the first witnesses. To gather the scattered information on microblogs, to learn their hot topics as early as possible, and to obtain the sentiment tendency of those topics, we study the following aspects.
First, we need to detect the hot topics on microblogs. A hot topic is very popular, emerges suddenly, and lasts only a short time. It is popular because users search for it and discuss it many times. It was seldom or never mentioned before it became hot, whereas thousands of users begin to talk about it once it emerges. Moreover, because information on microblogs updates very quickly, a new topic can soon become hot while an old one ceases to be. We propose a hot topic detection technique based on these properties.
After a hot topic has been detected, users need to know what it is really about. Given the volume of posts, reading microblogs one by one is time-consuming, so this is the second problem we address. We simplify it to selecting a subset from the collection of microblogs on a given topic, such that the selected subset is the most relevant to the topic and its microblogs describe the topic as a whole. Users tend to use particular words to describe a topic; we call these words the feature word set of the topic. We propose an automatic microblog summarization technique based on this feature set.
First, we compute the similarity between the feature set and every microblog, and put the microblog with the highest similarity to the feature set into the subset. We then compute the similarities between the remaining microblogs and those already in the subset, and add the microblog with the highest value to the subset; this step repeats until the subset is full. We thus propose methods for microblog hot topic detection and automatic summarization. The experimental results show that our methods are effective.
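The greedy subset selection described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not name a similarity measure, so cosine similarity over bag-of-words counts is assumed; tokens are split on whitespace (Chinese text would first need a word segmenter); and "similarity to the subset" is taken to be the best match against any microblog already selected. The names `cosine` and `summarize` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(microblogs, feature_words, size):
    """Greedily pick up to `size` microblog indices for one topic."""
    # Assumption: whitespace tokenization stands in for real segmentation.
    vectors = [Counter(text.split()) for text in microblogs]
    features = Counter(feature_words)
    remaining = set(range(len(vectors)))
    if not remaining or size <= 0:
        return []

    # Seed: the microblog most similar to the topic's feature word set.
    # Iterating in sorted order makes ties resolve to the lowest index.
    first = max(sorted(remaining), key=lambda i: cosine(vectors[i], features))
    selected = [first]
    remaining.remove(first)

    # Growth: repeatedly add the remaining microblog with the highest
    # similarity to the subset (assumed: best match among selected posts).
    while remaining and len(selected) < size:
        best = max(
            sorted(remaining),
            key=lambda i: max(cosine(vectors[i], vectors[j]) for j in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `summarize(posts, feature_words, k)` returns the indices of the chosen microblogs in selection order, so the first index is always the post closest to the feature word set. Other readings of the second step, such as averaging similarity over the subset, would need only a change to the inner `key` function.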
Keywords/Search Tags: Microblog, Hot topic detection, Automatic summarization, Natural language processing