
Topic Discovery Based on Credibility Evaluation

Posted on: 2015-12-16  Degree: Master  Type: Thesis
Country: China  Candidate: X G Li  Full Text: PDF
GTID: 2348330518470621  Subject: Software engineering
Abstract/Summary:
With its novel mechanisms for producing and transmitting information, the micro-blog platform has become an important birthplace and transmission channel for social hot spots, which has driven research on hot topic detection. However, because micro-blogs are anonymous and convenient to use, some fabricated, untrustworthy topics also become popular on the Internet. Research on the credibility of micro-blog information is therefore critical: it affects not only the direction of public opinion across the network but also social stability and harmony.
Previous credibility research evaluates topics that are already hot and then verifies their credibility value and truthfulness. By that point, however, an untrustworthy topic has already spread across the network as a hot spot, so this approach does not stop such topics at the source. This paper therefore combines topic discovery methods with the characteristics of micro-blog information and of credibility research, and proposes a topic discovery method based on credibility evaluation.
This paper introduces the concept of a credible hot topic, defines it together with its evaluation indicators, and presents a framework for discovering credible hot topics. The framework consists mainly of user credibility evaluation, micro-blog data preparation, topic extraction, and credible hot topic discovery. For user credibility evaluation, this paper considers not only how a user's basic attributes affect that user's credibility, but also proposes an improved PageRank-based credibility allocation algorithm that uses relative credibility. The algorithm mainly accounts for the influence that followers of differing quality have on a user's credibility, so that credibility is allocated more reasonably.
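As an illustration only, the idea of a PageRank variant that allocates credibility by relative credibility might be sketched as follows. The abstract does not give the formula, so everything here is an assumption: the function name `credibility_rank`, the `follows` and `base` inputs, and the rule that a follower splits its score among the accounts it follows in proportion to their attribute-based credibility rather than equally.

```python
def credibility_rank(follows, base, damping=0.85, iterations=50):
    """PageRank-style credibility scores (hypothetical sketch).

    follows: dict mapping each user to the list of users they follow.
    base:    dict mapping each user to an attribute-based credibility >= 0.
    Assumption: a follower hands out its score in proportion to the base
    credibility of each followee, not in equal shares as in plain PageRank.
    """
    users = list(base)
    n = len(users)
    score = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Every user starts each round with the teleport share.
        new = {u: (1.0 - damping) / n for u in users}
        for fan in users:
            followees = follows.get(fan, [])
            total = sum(base[v] for v in followees)
            if not followees or total == 0:
                # Dangling user: spread its score evenly over everyone.
                for u in users:
                    new[u] += damping * score[fan] / n
                continue
            for v in followees:
                # Relative allocation: weight by the followee's base credibility.
                new[v] += damping * score[fan] * base[v] / total
        score = new
    return score
```

With `follows = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}` and `base = {"a": 0.5, "b": 0.2, "c": 0.8}`, user `c` ends up with a higher score than `b`, because it receives the larger share of `a`'s score and all of `b`'s. The scores always sum to 1, as in standard PageRank.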
During micro-blog preprocessing, this paper treats time as an important factor in the text similarity calculation, given how much timing matters to an event. By analyzing the time distance between an event and a topic, it estimates the likelihood that the event belongs to that topic. Topic extraction is based on the Single-Pass text vector clustering algorithm. Because Single-Pass depends on the order in which texts are processed, and because comments and reposts are the main way micro-blogs spread and so indicate their popularity, the method first calculates the heat of each micro-blog, sorts the micro-blogs by heat, and only then clusters them. Finally, the clustered topics are assessed against the credible hot topic evaluation indicators to obtain the credible hot topics.
The experiments use data from the micro-blog platform to analyze the proposed user credibility evaluation algorithm, the influence of the time factor on hot topic detection metrics, and credible hot topic discovery under TDT (topic detection and tracking) evaluation. The user credibility evaluation algorithm is compared with the traditional PageRank algorithm; hot topic detection with the time factor is compared against detection without it; and the credibility-based topic discovery algorithm is compared with the SP&HA algorithm. The results indicate that the proposed credibility-based topic discovery method is accurate and efficient at discovering topics.
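The heat-ordered Single-Pass clustering with a time factor described above could look roughly like the sketch below. The abstract does not specify the details, so the following are assumptions made for illustration: heat is taken as reposts plus comments, the time factor is an exponential decay with a hypothetical `half_life_hours` parameter, text vectors are plain word counts, and a topic keeps the timestamp of its first post.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(posts, threshold=0.3, half_life_hours=24.0):
    """Cluster posts in one pass, hottest first (hypothetical sketch).

    posts: list of dicts with keys "text", "hours", "reposts", "comments".
    """
    # Single-Pass is order-sensitive, so process the hottest posts first.
    ordered = sorted(posts, key=lambda p: p["reposts"] + p["comments"], reverse=True)
    topics = []
    for post in ordered:
        vec = Counter(post["text"].lower().split())
        best, best_sim = None, 0.0
        for topic in topics:
            # Time factor: similarity halves for every half-life of distance.
            gap = abs(post["hours"] - topic["hours"])
            sim = cosine(vec, topic["centroid"]) * 0.5 ** (gap / half_life_hours)
            if sim > best_sim:
                best, best_sim = topic, sim
        if best is not None and best_sim >= threshold:
            best["centroid"].update(vec)  # add the post's counts to the topic
            best["posts"].append(post)
        else:
            topics.append({"centroid": vec, "hours": post["hours"], "posts": [post]})
    return topics
```

Sorting by heat means the most widely shared posts seed the clusters, which is one way to reduce the algorithm's sensitivity to arrival order. Two posts with identical text but far apart in time fall into separate topics because the decay pushes their similarity below the threshold.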
Keywords/Search Tags:micro-blog, topic detection, credibility, Single-Pass, Trusted hot topic