The Research And Design Of Internet Event Automatic Detection System

Posted on: 2014-12-22
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhao
Full Text: PDF
GTID: 2268330392973676
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the steady improvement of network infrastructure, Internet technology has developed by leaps and bounds. Internet services have brought great convenience to people's lives, and the Internet has become an indispensable part of daily life and an important way for people to get information. Advances in network technology have greatly increased the speed at which information is collected and disseminated, and the volume of information keeps growing. Facing this flood of information on the Internet, people find themselves in a dilemma: data is abundant but knowledge is scarce. People often go online to read the news. However, because reports about a single event are usually scattered across different news sites, these isolated reports do not give readers a comprehensive understanding of the event. How to build an Internet-oriented automatic event identification system, one that gathers the news reports on a topic by mining data collected from different sites, is therefore an important research subject.

Topic detection and tracking is an intelligent information processing technology that studies how to detect new events and track their subsequent development. It mainly uses data mining and natural language processing to aggregate news from various sites and organize it by event, so that people can gain a comprehensive understanding of an event at a single Web site.

First, the thesis introduces the background and significance of research on automatic Internet event detection, as well as the state of research on topic detection and tracking technology in China and abroad. Second, it studies the theory of topic detection and tracking and its related technologies, such as topic models, text feature selection methods, and text similarity calculation. It then discusses the traditional Single-Pass clustering algorithm. Given the requirements that the Internet event automatic detection system places on text clustering, the traditional Single-Pass clustering algorithm must be optimized to make it suitable for clustering real-time news streams. Finally, taking the flow of news on the Internet as the object to be processed and the improved Single-Pass clustering algorithm as the main method, the thesis develops the Internet event automatic detection system.

The research involves natural language processing, data mining, and related fields. The Internet-oriented automatic detection system has reference value for Web data mining research and practical application value on the real Internet.
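For readers unfamiliar with Single-Pass clustering, the following is a minimal sketch of the traditional algorithm only; the abstract does not describe the thesis's optimizations, so none are shown. Each incoming document is compared with the existing clusters and either joins the most similar one or starts a new cluster. The function names, the whitespace tokenizer, the raw term-frequency vectors, and the `threshold` value of 0.3 are all illustrative assumptions, not details taken from the thesis.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed stand-in for a weighted feature vector)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(docs, threshold=0.3):
    """Cluster documents in arrival order, reading each document once.

    threshold is a hypothetical similarity cutoff chosen for illustration.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc indices]}
    for i, text in enumerate(docs):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Join the closest cluster; summed counts serve as the centroid,
            # since cosine similarity ignores vector length.
            best["members"].append(i)
            best["centroid"].update(vec)
        else:
            # No cluster is similar enough: treat the document as a new event.
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return clusters


if __name__ == "__main__":
    news = [
        "earthquake hits city rescue teams arrive",
        "rescue teams search city after earthquake",
        "football team wins championship final",
        "championship final won by football team",
    ]
    for cluster in single_pass(news):
        print(cluster["members"])
```

Because every document is processed once and never revisited, the result depends on arrival order and on the threshold, which is one reason the traditional algorithm may need adaptation for real-time news streams.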
Keywords/Search Tags: Web Mining, Topic Detection and Tracking, Text Clustering, Single-Pass Clustering Algorithm, Internet News