The Design And Implementation Of The Hot Topic Detection System Based On The Improved Single-Pass Algorithm

Posted on: 2016-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: P W Zhang
Full Text: PDF
GTID: 2308330464472800
Subject: Computer technology
Abstract/Summary:
The Internet has grown from its emergence into a flourishing medium that plays an increasingly important role in people's economic and social life, and we have entered an era of unprecedented information wealth. As information on the network explodes, we can obtain more of it, but it is also messy and disordered, which makes it hard to manage and to find what is valuable. A tool that retrieves the information we need has therefore become an urgent requirement. Search engines relieve information overload to some extent: users find information by entering keywords. But because search engines rely on keyword matching and do not filter their results, what they return contains a high degree of redundancy and many irrelevant pages that merely contain the keywords, so users must spend considerable time sifting the results according to their own needs. For hot topics, search engines are even less effective. At present, hot topics and events are usually selected manually, through online voting or by experts in the field, and the selection is therefore somewhat subjective.

To solve these problems, and based on an analysis of existing technologies and results, this thesis designs and implements the following: (1) By analyzing the requirements of hot topic detection, we build the framework of the hot topic detection system and settle its system architecture and processing flow. (2) Drawing on current research and technology at home and abroad, we design and implement the modules of the hot topic detection system: information collection, information preprocessing, topic detection, hot topic identification, user account management, and others. To reduce computational complexity and improve accuracy, we improve the Single-Pass clustering algorithm, a text mining technique, in three respects: clustering strategy, text vector representation, and similarity calculation. Finally, the heat of each topic is calculated, the topics are ranked by heat, and the extracted information is displayed using web technologies. The results show that the system designed with the methods described here can extract and discover hot topics.
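For readers unfamiliar with Single-Pass clustering, the baseline idea can be sketched as follows. This is a minimal illustration of the standard algorithm, not the improved version developed in the thesis, whose details the abstract does not give. The whitespace tokenizer, raw term-frequency vectors, the threshold value of 0.3, and the function names (`vectorize`, `cosine`, `single_pass`, `rank_by_size`) are all assumptions made for the example. Ranking clusters by member count is only a stand-in for the thesis's heat calculation, which is not specified here.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector. Assumption: plain whitespace tokenization."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(docs, threshold=0.3):
    """Baseline Single-Pass clustering: each document is seen exactly once.

    A document joins the most similar existing cluster if the similarity
    reaches `threshold` (an assumed value); otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc indices]}
    for i, text in enumerate(docs):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Keep summed term counts as the centroid; cosine ignores scale.
            best["centroid"].update(vec)
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return clusters


def rank_by_size(clusters):
    """Stand-in for heat ranking (assumption): larger clusters rank first."""
    return sorted(clusters, key=lambda c: len(c["members"]), reverse=True)
```

Because each document is compared only against the current cluster centroids, the result depends on input order and on the threshold, which is one reason the clustering strategy, vector representation, and similarity measure are natural targets for improvement.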
Keywords/Search Tags: Feature Selection, Similarity calculation, Text clustering, Single-Pass algorithm, Topic Detection System