The Design And Implementation Of Bad Information Filtering Technology Based On Internet

Posted on: 2007-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: M J Jia
Full Text: PDF
GTID: 2178360185966498
Subject: Computer application technology
Abstract/Summary:
As a hallmark of the information age, the Internet offers a vast amount of information and makes daily life easier. Some of that information, however, is harmful and seriously affects society. Internet-based bad information filtering technology emerged in response. As the term is used internationally, information filtering here refers to identifying illegitimate text that contains harmful content and removing it. As the volume of illegitimate text on the Web grows, this has become a new research area within information filtering.

Most existing information filtering systems are keyword-based or rule-based; there are also content-based systems. This thesis briefly introduces the research background and the domestic and international results achieved in the information filtering field. It also systematically explores the theories and technologies involved, such as the classification of information filtering systems, the main mathematical models, and text classification algorithms.

After studying and evaluating the various algorithms used in information filtering systems, the thesis proposes a two-step approach aimed at improving the performance and efficiency of information filtering: the first step is based on keyword filtering and IP address filtering, and the second step is based on the content of the text.

The characteristics of illegitimate text are analyzed thoroughly, and the content and vocabulary features of illegitimate texts are summarized. The crucial technologies adopted by the system, word segmentation and feature extraction, are discussed in detail. Maximum Matching, the chi-square (χ²) test, and k-Nearest Neighbor (KNN) classification are applied in the illegal web page filtering system. The information filtering is implemented on the Windows operating system.
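
The two-step approach and the three named techniques can be illustrated with a minimal Python sketch. This is not the thesis implementation, which the abstract does not show: the function names, the `max_len` and `k` defaults, the cosine similarity measure, the "blocked" label, and the toy data in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter


def forward_max_match(text, lexicon, max_len=4):
    """Forward maximum matching: take the longest lexicon word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no lexicon word matches.
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def chi_square(term, docs, labels, target):
    """Chi-square score of a term for one class, from a 2x2 contingency table."""
    a = sum(1 for d, y in zip(docs, labels) if term in d and y == target)
    b = sum(1 for d, y in zip(docs, labels) if term in d and y != target)
    c = sum(1 for d, y in zip(docs, labels) if term not in d and y == target)
    d_ = len(docs) - a - b - c
    denom = (a + c) * (b + d_) * (a + b) * (c + d_)
    return len(docs) * (a * d_ - b * c) ** 2 / denom if denom else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def knn_predict(query, docs, labels, features, k=3):
    """Majority vote among the k training documents most similar to the query."""
    def vec(tokens):
        counts = Counter(tokens)
        return [counts[f] for f in features]

    q = vec(query)
    ranked = sorted(zip(docs, labels),
                    key=lambda pair: cosine(q, vec(pair[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def two_step_filter(text, source_ip, blocked_ips, blocked_keywords, classify):
    """Step 1: cheap keyword and IP checks. Step 2: content-based classifier."""
    if source_ip in blocked_ips or any(kw in text for kw in blocked_keywords):
        return "blocked"
    return classify(text)
```

As a usage example under the same assumptions, `forward_max_match("abcd", {"ab", "abc", "cd"})` prefers the longer entry `"abc"` over `"ab"` and leaves `"d"` as a single-character token. Running the inexpensive keyword and IP checks first means the more costly segmentation and KNN step is only reached by text that passes step one, which is one plausible reading of how the two-step design improves efficiency.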
Keywords/Search Tags: information filtering, illegitimate text, word segmentation, feature extraction, KNN algorithm