
Research Of Misuse IDS Based On Sequential Pattern Mining And Key Technologies

Posted on: 2006-03-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S J Song
Full Text: PDF
GTID: 1118360155472162
Subject: Computer Science and Technology
Abstract/Summary:
Intrusion detection is the next generation of security assurance technology, following firewalls, data encryption, and related techniques. As the volume of Internet data grows, traditional intrusion detection, in which detection models are built by hand, can no longer keep up with the network environment. To address the problem of extracting knowledge from massive data, researchers have proposed intrusion detection based on data mining.

As intrusion techniques have evolved, many intrusions hide their signatures in the order in which events occur. An individual packet or command looks normal and carries no evident detection signature, but an ordered sequence of packets or commands constitutes an attack, and that sequence appears only once within a given attack. To discover the rules behind this kind of attack, we introduce sequential pattern mining into intrusion detection systems. Sequential pattern mining extracts patterns of coarser granularity than association rule mining. We gather multiple instances of a sequential attack as training data, find the attribute sequences of attack behavior that recur across instances but appear only once in each instance, and build a detection model from these attribute sequences. Sequential pattern mining overcomes the inability of association rule algorithms to reflect the order in which events occur, and it detects application-layer R2L (remote to local) and U2R (user to root) attacks, which remain a difficult problem in intrusion detection, thereby improving the detection rate.

The main research contents of this dissertation include intrusion detection techniques, protocol analysis and behavior analysis, data warehouse techniques, and data mining and sequential pattern mining techniques, among others.
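The contrast drawn above between order-aware sequential patterns and order-blind association rules can be illustrated with a minimal Python sketch. The event names and both helper functions are hypothetical and chosen only for illustration; they are not taken from the dissertation.

```python
def contains_in_order(session, pattern):
    """Return True if `pattern` occurs in `session` as an ordered subsequence.

    Other events may be interleaved, but the pattern steps must appear
    in the given order.
    """
    events = iter(session)
    # `step in events` consumes the iterator up to the first match, so each
    # later step is only searched for after the previous one was found.
    return all(step in events for step in pattern)


def contains_unordered(session, pattern):
    """Order-blind check, in the spirit of an association-rule itemset."""
    return set(pattern) <= set(session)


# Hypothetical application-layer events.
attack_pattern = ["login_guest", "get_passwd_file", "su_root"]
attack_session = ["login_guest", "ls", "get_passwd_file", "cd", "su_root"]
benign_session = ["su_root", "ls", "login_guest", "get_passwd_file"]

for name, session in [("attack", attack_session), ("benign", benign_session)]:
    print(name,
          contains_in_order(session, attack_pattern),
          contains_unordered(session, attack_pattern))
```

Both sessions contain the same three events, so the order-blind check cannot tell them apart; only the ordered check separates the session in which the events occur in the attack order.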
After studying data-mining-based IDSs, the open problems of intrusion detection, and the current state of data mining techniques, we present an application-layer misuse IDS based on sequential pattern mining and obtain the following results:

1. We present SEMIDS, a framework for a misuse IDS based on SEquential pattern Mining.

After analyzing the advantages and disadvantages of the various data-mining-based IDSs, we apply sequential pattern mining, protocol analysis and behavior analysis, and misuse detection to IDS design. The resulting framework, SEMIDS, overcomes the disadvantages of statistical analysis methods, extracts information about the order in which events occur, and detects sequential attacks whose signature appears only once.

Within the framework, we put forward a method for extracting and storing application-layer attribute sequences and present a sequential pattern mining algorithm that fits the intrusion detection data format. The system implements pattern matching and pattern comparison with a candidate chain structure; scores and redirects matching data through a scoring mechanism, which can record attack behaviors and all suspicious behaviors even when the intruder holds administrator privileges; adds self-learning capability through a self-accommodation mechanism, which can rebuild the detection model for transformed attacks; and takes suspicious behaviors reported by other IDSs as input to a decision mechanism, building an open system with state transition analysis. Experiments indicate that the system can describe an intruder's attack behavior, build a sequential attribute model of the attack, and facilitate the detection of R2L and U2R attacks.

2. We present a method for extracting and storing attribute sequences of application-layer protocols.

Using the ordered structure of network protocols, behavior analysis not only obtains application-layer information from individual packets but also assembles multiple packets into application-layer sessions and reconstructs the intruder's attack behavior at the application layer. Behavior analysis is the advanced stage of intrusion detection.

We use the protocol and behavior analysis tools Bro and NetMonitor to extract application sessions. To meet the requirements of the sequential pattern mining algorithm and to simplify encoding without losing intrusion information, we parse the Telnet, FTP, and HTTP protocols and netcat separately, present a method for comparing, selecting, and combining the attributes of these application-layer protocols, and extract application-layer attribute sequences.

To obtain better sequential pattern mining performance, we build a ROLAP star-schema data warehouse with Microsoft SQL Server 2000 Analysis Services and set up six dimensions and four measures to store Telnet, FTP, and HTTP protocol attributes. The warehouse stores these attributes in a uniform format and arranges them along different dimensions, which makes it possible to describe an attack from several points of view, clean "noisy" data, and supply qualified data for sequential pattern mining. The warehouse also takes information from different IDSs as input to the decision mechanism. As an important part of the self-accommodation mechanism, the warehouse supplies data for building new detection models.

3. We present an efficient sequential pattern mining algorithm that fits the intrusion detection data format.

We present an efficient sequential pattern mining algorithm, HVSM-1 (a first-Horizontally-last-Vertically scanning database Sequential pattern Mining algorithm).
The algorithm converts the horizontal transaction-item structure into a vertical structure, which effectively reduces the number of scans of the original database. It uses a large-itemset reuse method to enlarge the mining granularity, counts candidate sequential patterns with a bitmap method, and speeds up mining by pruning candidate sequences. Experiments indicate that the HVSM algorithm outperforms the SPADE and PrefixSpan algorithms.

To meet the requirements of the intrusion detection data format, we present a second sequential pattern mining algorithm, HVSM-2, built on HVSM-1. In HVSM-1, the original data have N attributes, and mining selects 0 to N attributes of a record at a time. In HVSM-2, the original data have N attributes and each attribute has M attribute values; mining again selects 0 to N attributes of a record at a time, but exactly one value can be selected from the M values of each selected attribute. HVSM-2 is suited to sequential pattern mining of application-layer intrusion detection data and produces satisfactory results in detecting R2L and U2R attacks.

The research work of this dissertation is supported by several pre-research projects. Its results support the development of those projects and will guide research on misuse intrusion detection based on data mining.
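The abstract does not give the details of HVSM, so the following is not the HVSM algorithm itself. It is a generic Python sketch of the two ideas named above: converting a horizontal sequence database into a vertical layout in a single scan, and counting the support of a candidate sequence with bitmaps instead of rescanning the data. The function names `to_vertical` and `support`, the single-item events, and the toy database are assumptions made for illustration.

```python
from collections import defaultdict


def to_vertical(sequences):
    """Scan the horizontal database once.

    Returns {item: {sequence_id: bitmask}}, where bit `pos` of the bitmask
    is set if the item occurs at position `pos` of that sequence.
    """
    vertical = defaultdict(dict)
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            vertical[item][sid] = vertical[item].get(sid, 0) | (1 << pos)
    return vertical


def support(vertical, pattern):
    """Count the sequences that contain `pattern` as an ordered subsequence.

    Only the bitmaps are used; the original sequences are not rescanned.
    """
    if not pattern:
        return 0
    # For each sequence: positions at which the current prefix can end.
    ends = dict(vertical.get(pattern[0], {}))
    for item in pattern[1:]:
        item_masks = vertical.get(item, {})
        extended = {}
        for sid, mask in ends.items():
            first = mask & -mask            # earliest end of the prefix
            after = ~((first << 1) - 1)     # every position strictly later
            hit = after & item_masks.get(sid, 0)
            if hit:
                extended[sid] = hit
        ends = extended                     # sequences with no hit are pruned
    return len(ends)


# Toy database of four event sequences (hypothetical single-letter events).
database = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "a", "c"],
    ["c", "b", "a"],
]
vertical = to_vertical(database)
print(support(vertical, ["a", "c"]))
print(support(vertical, ["a", "b", "c"]))
```

Python integers serve as arbitrary-length bitmaps here, which keeps the sketch short; an implementation aimed at large databases would more likely use fixed-width bit arrays.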
Keywords/Search Tags: intrusion detection, data mining, sequential pattern, behavior analysis, association rule