
High Speed Network Protocol Identification Research Based On Content Analysis

Posted on: 2008-12-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S H Chen
GTID: 1118360242999236
Subject: Computer Science and Technology
Abstract/Summary:
Protocol Identification (also called Application-Layer Traffic Analysis) is the basis of firewalls, intrusion detection systems, content auditing systems and network management systems. Protocol Identification on an ISP backbone currently faces many challenges, such as high traffic volume, new protocols, and the infeasibility of transport-layer port matching. Several novel approaches have recently been proposed, such as methods based on statistics and models and methods based on deep packet processing, but these methods fall short in performance, accuracy, extensibility, and robustness.

This thesis focuses on Protocol Identification for the ISP backbone. Building on a study of many Protocol Identification methods, it proposes a new method that combines software and hardware and is capable of processing complicated, high-speed traffic with many concurrent streams. The main contributions of the thesis are as follows:

(1) A High Speed Network Protocol Identification Framework is proposed. The framework is composed of a Protocol Identification Architecture (PIA), a Communication Mechanism, a Protocol Identification Method and a Protocol Identification Description Language (PIDL). A high-speed switch is used in PIS to connect several line cards, which solves the asymmetric routing and routing balance problems. A master card controls the line cards and is responsible for compiling and loading rules. PIDL is used to describe protocol features, which solves the extensibility problem. The Matching Engine is implemented in customized hardware to enhance performance. Port matching, string matching and regular expression matching are all applied to the packet stream, which improves accuracy.

(2) Port matching in Protocol Identification often involves both discrete ports and port ranges. To achieve wire-speed port range matching, a range matching algorithm called LRC-RM is proposed.
LRC-RM maps each port range into a compressed bit vector and organizes the bit vectors as an extended balanced binary tree. When a packet is matched, the corresponding bit vector is retrieved, and the bit vector and the other fields are then sent to the TCAM to obtain the matching result. Experiments on an actual network show that an NIDS using LRC-RM can perform wire-speed range matching on OC-192 links while using much less memory than existing methods. LRC-RM can easily be implemented within a chip without additional RAM.

(3) To solve the exact string matching problem in protocol identification, we introduce a TCAM-based hardware detection method called Linking Shared Multi-Match (LSMM), which can effectively perform Multi-Pattern Multi-Match over large sets of string rules.

(4) The relationship between the single pattern FSM and the Multi-Pattern FSM (MPFSM) has been studied. An MPFSM matching engine is simple in structure but needs more memory, whereas a single pattern FSM needs little memory but has a complex structure. For MPFSM, an Epsilon Compressed NFA (ECNFA) construction algorithm based on the Thompson algorithm has been put forward and implemented. This algorithm speeds up the conversion from NFA to DFA by reducing the number of epsilon edges and the corresponding states. A one-pass multiple-pattern protocol identification system has been implemented using the MPFSM and the corresponding algorithms. For the single pattern FSM matching engine, we introduce a Set Intersected Precode (SI-Precode) method. SI-Precode encodes all input symbols before the conversion from regular expressions to NFA, and the FSM state transition table is then made smaller by compressing the input symbols.
Using SI-Precode, the state transition table can be placed in the FPGA without additional memory.

Based on the above research, and with the support of the National Grand Fundamental Research 863 Program of China and the National Information Security Research 242 Program, an Application Classifying & Splitting Gather (ACSGather) for a 2.5G POS bidirectional link has been designed and implemented. The accuracy and performance of ACSGather have also been evaluated. ACSGather has been deployed since 2006 and has produced positive results.
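
The input-symbol compression idea behind SI-Precode can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the abstract does not specify; it only assumes that two input bytes can share one code whenever they belong to exactly the same character sets used by the rules. The function name `build_symbol_classes` and the three example character sets are hypothetical.

```python
def build_symbol_classes(char_sets, alphabet_size=256):
    """Give the same code to every input symbol that belongs to exactly
    the same character sets (illustrative assumption, not SI-Precode itself).

    Returns (class_of, num_classes), where class_of[symbol] is the
    compressed code of that symbol.
    """
    class_of = [0] * alphabet_size
    signature_to_class = {}
    for symbol in range(alphabet_size):
        # The signature records which sets contain this symbol.
        signature = tuple(symbol in s for s in char_sets)
        if signature not in signature_to_class:
            signature_to_class[signature] = len(signature_to_class)
        class_of[symbol] = signature_to_class[signature]
    return class_of, len(signature_to_class)


# Hypothetical character sets that a few protocol rules might use.
digits = set(b"0123456789")
hex_digits = set(b"0123456789abcdefABCDEF")
letter_g = set(b"G")

class_of, num_classes = build_symbol_classes([digits, hex_digits, letter_g])

# A transition table indexed by class needs num_classes columns per state
# instead of one column for each of the 256 byte values.
print(num_classes)
```

With these three sets the 256 byte values collapse into four codes (decimal digits, the remaining hexadecimal letters, `G`, and everything else), so a table indexed by code needs 4 columns per state rather than 256. A reduction of this kind is what could let a transition table fit in on-chip memory.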
Keywords/Search Tags: Protocol Identification, Application-Level Traffic Analysis, Packet Classification, Pattern Matching, Regular Expression