Objectionable Information Filtering System Based On ATN Algorithm And Latent Semantic Indexing

Posted on: 2012-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Su
Full Text: PDF
GTID: 2218330338967263
Subject: Computer application technology
Abstract/Summary:
In recent years the Internet has developed rapidly. The information available online has grown exponentially, and there are now hundreds of millions of websites around the world. This growth has a downside: a large amount of objectionable information spreads unchecked across the Internet, with a serious negative impact on social order and on people's daily lives. This thesis describes the current state of objectionable content on the Internet and summarizes the concepts, characteristics, and related products of the three common objectionable information filtering technologies. It then analyzes and compares their filtering results, advantages, and disadvantages. On this basis, we design an objectionable information filtering system based on the Augmented Transition Network (ATN) and Latent Semantic Indexing (LSI). The system uses the ATN algorithm for Chinese word segmentation in place of traditional segmentation based on string matching; because the ATN algorithm is based on Chinese lexical and grammatical rules, its segmentation results are more accurate. For filtering, LSI replaces the simple traditional vector space model and copes well with polysemy (one word with several meanings) and synonymy (several words with one meaning) in Chinese. To make the filtering results usable in practice, the system marks them with PICS (Platform for Internet Content Selection) labels and stores them in a database for use by applications. Finally, to verify the filtering performance of the system, we ran experiments on the Data Sets of Chinese E-Mails (CDSCE) provided by the China Education and Research Network Emergency Team (CCERT). The results show that the system design is sound and that its filtering performance is good.
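The difference between LSI and the plain vector space model on synonymy can be illustrated with a minimal sketch. This is not the thesis implementation: the three-document corpus, the English tokens, the choice of k = 2, and the helper names `to_vector`, `cosine`, and `project` are all assumptions made for illustration, and the documents are assumed to be already segmented into terms (the step that ATN performs in the system described above). The sketch builds a term-document matrix, takes a truncated SVD with NumPy, and compares cosine similarity in the raw term space with cosine similarity in the latent space.

```python
import numpy as np

# Hypothetical toy corpus, already segmented into terms.
docs = [
    ["car", "engine"],         # d0
    ["automobile", "engine"],  # d1
    ["flower", "garden"],      # d2
]
terms = sorted({t for d in docs for t in d})
index = {t: i for i, t in enumerate(terms)}


def to_vector(tokens):
    """Raw term-count vector over the fixed vocabulary."""
    v = np.zeros(len(terms))
    for t in tokens:
        if t in index:
            v[index[t]] += 1.0
    return v


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0 or nb == 0 else float(a @ b / (na * nb))


# Term-document matrix: one row per term, one column per document.
A = np.column_stack([to_vector(d) for d in docs])

# Truncated SVD: keep only the k largest singular directions.
k = 2  # assumed latent dimension for this toy example
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]


def project(v):
    """Map a term vector into the k-dimensional latent space."""
    return U_k.T @ v


query = to_vector(["automobile"])
n_docs = A.shape[1]

# Plain vector space model: compare raw term vectors.
raw_scores = [cosine(query, A[:, j]) for j in range(n_docs)]

# LSI: compare the same vectors after projection into the latent space.
lsi_scores = [cosine(project(query), project(A[:, j])) for j in range(n_docs)]

for j in range(n_docs):
    print(f"d{j}: raw={raw_scores[j]:.3f} lsi={lsi_scores[j]:.3f}")
```

In this toy corpus the query "automobile" shares no term with d0 ("car engine"), so the raw cosine score for d0 is zero. Because "car" and "automobile" both co-occur with "engine", they fall on the same latent direction, and the LSI score for d0 comes out close to 1 while the unrelated d2 stays near 0. A real filter would use a much larger corpus, term weighting, and a tuned k.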
Keywords/Search Tags: Objectionable Information Filtering, Augmented Transition Network, Latent Semantic Indexing, Platform for Internet Content Selection