
Research Of Content Filtering On Multimedia Information

Posted on: 2014-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Guo
Full Text: PDF
GTID: 2248330395998605
Subject: Signal and Information Processing
Abstract/Summary:
The Internet brings great convenience to daily life, but it also provides a breeding ground for pornography, violence, and other sensitive information, so purifying the network has become a top priority in recent years.

This thesis focuses on multimedia content filtering and develops a filtering system with three layers: URL filtering, text filtering, and image filtering. Text filtering offers two strategies: keyword filtering and content filtering based on stratification.

In the traditional method, keywords are set by hand, which is highly subjective. In this thesis, samples are collected from the Internet by a web crawler, and keywords are then extracted from the samples according to a defined strategy.

The text of a web page is divided into three layers: the title and description form the first level, the body forms the second level, and the hyperlink text forms the third level. Because text in different layers supports the theme of the page to different degrees, each level is given a different weight: the higher the level, the larger the weight. Taking advantage of the semi-structured nature of web pages, the three layers are extracted with regular expressions.

During text processing, the text is segmented and tagged with parts of speech, and six types of important words are extracted: nouns, verbs, adjectives, adverbs, pronouns, and prepositions. This step reduces the dimensionality of the text and the complexity of subsequent processing.

The traditional weight function considers only word frequency; this thesis also takes into account the length of the keyword, the layer in which the keyword appears, and a simple semantic relationship between keywords.
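
The layered extraction and weighting described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the layer weights, the regular expressions, the whitespace-style tokenizer (standing in for word segmentation and part-of-speech tagging), and the names `split_layers` and `page_score` are all assumptions made for illustration, and the semantic-relationship term is left out.

```python
import re
from collections import Counter

# Hypothetical weights: the abstract only states that higher layers weigh more.
LAYER_WEIGHTS = {"head": 3.0, "body": 2.0, "links": 1.0}

# Simplistic patterns; real pages would need more tolerant matching.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.I | re.S)
DESC_RE = re.compile(
    r'<meta\s+name=["\']description["\']\s+content=["\'](.*?)["\']', re.I | re.S
)
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.I | re.S)
LINK_RE = re.compile(r"<a\b[^>]*>(.*?)</a>", re.I | re.S)
TAG_RE = re.compile(r"<[^>]+>")


def split_layers(html):
    """Split a page into three text layers: head (title + description), body, links."""
    title = " ".join(TITLE_RE.findall(html))
    desc = " ".join(DESC_RE.findall(html))
    body_match = BODY_RE.search(html)
    body_html = body_match.group(1) if body_match else ""
    # Hyperlink text is its own layer, so it is removed from the body layer.
    links = " ".join(TAG_RE.sub(" ", a) for a in LINK_RE.findall(body_html))
    body_text = TAG_RE.sub(" ", LINK_RE.sub(" ", body_html))
    return {"head": f"{title} {desc}", "body": body_text, "links": links}


def page_score(html, keywords):
    """Sum frequency x keyword length x layer weight over all layers (assumed form)."""
    score = 0.0
    for layer, text in split_layers(html).items():
        counts = Counter(re.findall(r"\w+", text.lower()))
        for kw in keywords:  # keywords are expected in lowercase
            score += counts[kw] * len(kw) * LAYER_WEIGHTS[layer]
    return score
```

With these assumed weights, a keyword in the title contributes three times as much as the same keyword in hyperlink text, which mirrors the "higher level, larger weight" rule stated in the abstract.
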
Test results show that both recall and precision were improved.

In the image filtering system, both skin color filtering and face recognition are taken into account, which greatly improves the accuracy of the filter.
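
The abstract does not say which skin color model the thesis uses. As a rough illustration only, the sketch below applies one widely used explicit RGB threshold rule and reports the fraction of skin-colored pixels. The thresholds and the names `is_skin` and `skin_ratio` are assumptions, and the face recognition step is not sketched.

```python
def is_skin(r, g, b):
    """Classify one RGB pixel with a common explicit threshold rule (an assumption)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_ratio(pixels):
    """Return the fraction of pixels classified as skin; 0.0 for an empty image."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if is_skin(*p)) / len(pixels)
```

A filter built on this idea might flag an image when `skin_ratio` exceeds some threshold, then use face detection to reduce false positives such as portrait photos; how the thesis combines the two signals is not stated in the abstract.
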
Keywords/Search Tags: Content Filtering, Text Filtering, Stratification, Weight Calculation