Web Content Filtering Key Technologies And Research

Posted on:2008-10-24

Degree:Master

Type:Thesis

Country:China

Candidate:G S Xu

Full Text:PDF

GTID:2208360212499626

Subject:Communication and Information System

Abstract/Summary:


With the rapid development of networks, people enjoy ever greater benefits from them, but they also suffer from the constant security threats that accompany them, such as leaks of secrets, viruses, and reactionary content, which are widespread across the network. To determine whether network content is safe, a network content filtering system scans the content of sessions between users and provides the original information for audit.

This paper presents a prototype network content filtering system that can scan and filter network packets with low packet loss. The Network Driver Interface Specification (NDIS) is adopted in the module that captures and processes network packets, and it provides a base for designing and testing algorithms in the Windows kernel. Because packets captured from the data link layer are of little use on their own for analyzing the session content between users, an efficient method is introduced to reassemble the TCP/IP packets.

Keyword filtering is performed by pattern matching algorithms and is, in practice, the performance bottleneck of a network content filtering system. This paper therefore analyzes existing string matching algorithms, including classic single-pattern and multi-pattern matching algorithms, and compares their performance in preparation for the design of more efficient algorithms.

The classic matching algorithms were designed for English words. Network content often mixes several languages, such as Chinese and English, and its character set is extremely large. Unlike English, Chinese has no prefixes or suffixes. The paper designs and implements two algorithms for the Chinese character set combined with other languages. The first, the CE_BM algorithm, targets FPGA and DSP platforms; it uses much less memory than the CE algorithm but achieves the same performance by using bit operations instead of string operations. The second, the AWM algorithm, is based on the well-known multi-pattern matching algorithm WM and is better suited to Chinese words.

Experiments on the two algorithms were carried out on Chinese texts mixed with English and other languages. The results show that both algorithms perform better than the originals. Finally, the system was tested under similar conditions, and the results show that it is capable of filtering the network content of a LAN.
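
As an illustration of the multi-pattern keyword matching discussed in the abstract, the following is a minimal sketch of a textbook Wu-Manber style search, assuming that "WM" refers to the Wu-Manber algorithm. It is not the thesis's AWM or CE_BM algorithm, whose details the abstract does not give. The function names (`build_tables`, `wm_search`), the block size of 2 bytes, and the choice of UTF-8 are assumptions made for this example; the thesis may well have used a different Chinese encoding such as GB2312.

```python
# Simplified Wu-Manber multi-pattern search over UTF-8 bytes.
# Illustrative sketch only: names, block size, and encoding are assumptions.

def build_tables(patterns, block=2):
    """Build the shift table and the buckets for a list of byte patterns."""
    m = min(len(p) for p in patterns)      # only the first m bytes drive shifts
    if m < block:
        raise ValueError("every pattern must be at least `block` bytes long")
    default_shift = m - block + 1          # shift for blocks in no pattern
    shift = {}
    buckets = {}
    for p in patterns:
        # Every block of the m-byte prefix limits how far we may skip.
        for end in range(block, m + 1):
            blk = p[end - block:end]
            dist = m - end                 # distance from block end to prefix end
            if dist < shift.get(blk, default_shift):
                shift[blk] = dist
        # Patterns are grouped by the last block of their m-byte prefix.
        buckets.setdefault(p[m - block:m], []).append(p)
    return m, default_shift, shift, buckets


def wm_search(text, keywords, block=2):
    """Return sorted (byte_offset, keyword) pairs for every occurrence."""
    data = text.encode("utf-8")
    patterns = [k.encode("utf-8") for k in keywords]
    m, default_shift, shift, buckets = build_tables(patterns, block)
    hits = []
    end = m                                # exclusive end of the current window
    while end <= len(data):
        blk = data[end - block:end]
        step = shift.get(blk, default_shift)
        if step == 0:
            # Candidate position: verify each pattern in the bucket in full.
            start = end - m
            for p in buckets[blk]:
                if data.startswith(p, start):
                    hits.append((start, p.decode("utf-8")))
            step = 1
        end += step
    return sorted(hits)


if __name__ == "__main__":
    sample = "the filter scans mail for 病毒 and 过滤 rules, then filters again"
    for offset, word in wm_search(sample, ["filter", "病毒", "过滤"]):
        print(offset, word)
```

Matching on bytes rather than characters keeps the shift table small even though the Chinese character set is large. With UTF-8 a byte-level match always starts on a character boundary; with a double-byte encoding such as GB2312 a match could begin in the middle of a character, so an extra alignment check would be needed there.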

Keywords/Search Tags:

Content Filtering, String Matching Algorithm, NDIS, TCP/IP Protocols


Related items

1	Research On Key Technology Of Network Content Filtering Based On NDIS Intermediate Driver
2	Approximate String Matching For Chinese Characters By Combining Filtering And Bit-parallelism
3	String Matching Algorithm Design And Implementation, Based On The Hierarchical Classification Of Web Content Monitoring System
4	Hardware-based String Matching Algorithm In Network Content Analysis
5	The Design And Implementation Of Content Filtering Firewall
6	Research Of String Matching For Internet Content Filtering
7	Key Techniques Of Data Filtering On High Speed Network
8	Research Of High-Performance String Matching Technology For Large Scale String Set
9	String Matching Algorithm And Application Of Network Content Analysis
10	Approximate String Matching Algorithms And Optimization Techniques Using Local Filtering