Research and Implementation of a Bad Information Filter Based on a Concept Network

Posted on: 2009-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: D L Sun
Full Text: PDF
GTID: 2178360242976845
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of the Internet, more and more bad information, including reactionary, violent, and pornographic content, can be spread over the Internet easily and without constraint. At the same time, traditional Internet content security surveillance technologies, which are based on the origin of the information, URLs, and classification systems, face new challenges from the rise of new phenomena such as Web 2.0, blogs, and P2P. One feasible approach to this problem is to filter on the content that the information publisher posts, which avoids the disadvantages of the traditional technologies.

A concept is the abstraction and generalization of the features of objects. Compared with words and expressions, a concept has a higher level of abstraction. Each concept can be expressed by one or more Chinese characters, and a concept network is a network structure composed of concept nodes and their relationships.

In this paper, we present a way to build a concept network and a filtering method whose goal is to separate bad texts from normal texts. First, we present a method for extracting concepts from a corpus based on the characteristics of bad information text. We then present two measures, based on fuzzy set membership, for calculating the similarity between concepts. Using these measures, we construct a three-level recursive concept network, which is generated by clustering low-level concepts into high-level concepts.

Based on the features of the concept network, this paper also presents a filtering method that operates on the concept network by estimating the similarity between each concept and the text. From the result, we can judge whether or not a text should be blocked. The experiments show that this method is both accurate and fast.
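The abstract does not define its two similarity measures or its filtering rule, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes a concept's fuzzy set assigns each corpus document a membership degree equal to the concept's normalized occurrence count, uses a fuzzy Jaccard ratio (sum of minima over sum of maxima) as a stand-in similarity measure, and blocks a text when the weighted share of matched concepts reaches a threshold. The names `membership`, `fuzzy_similarity`, `should_block`, the weights, and the threshold are all hypothetical and are not taken from the thesis.

```python
from typing import Dict

FuzzySet = Dict[str, float]  # document id -> membership degree in [0, 1]


def membership(concept: str, docs: Dict[str, str]) -> FuzzySet:
    """Membership of each document in the concept's fuzzy set.

    Assumption: degree = occurrences of the concept in a document,
    divided by the largest occurrence count across all documents.
    """
    counts = {doc_id: text.count(concept) for doc_id, text in docs.items()}
    peak = max(counts.values(), default=0)
    if peak == 0:
        return {doc_id: 0.0 for doc_id in docs}
    return {doc_id: n / peak for doc_id, n in counts.items()}


def fuzzy_similarity(a: FuzzySet, b: FuzzySet) -> float:
    """Fuzzy Jaccard similarity (an assumed measure): sum of min over sum of max."""
    keys = set(a) | set(b)
    inter = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    union = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return inter / union if union > 0 else 0.0


def should_block(text: str, weights: Dict[str, float], threshold: float) -> bool:
    """Flag a text when the weighted share of matched concepts reaches threshold."""
    total = sum(weights.values())
    if total == 0:
        return False
    matched = sum(w for concept, w in weights.items() if concept in text)
    return matched / total >= threshold
```

Because `str.count` and the `in` operator work on arbitrary substrings, the same sketch applies to concepts written as one or more Chinese characters. Pairwise values from `fuzzy_similarity` could then feed a clustering step that groups low-level concepts into higher-level ones, though the clustering procedure itself is not specified in the abstract.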
Keywords/Search Tags: INFORMATION FILTER, TEXT CLASSIFICATION, CONCEPT NETWORK, FUZZY SET