Research And Implementation Of Intrusion Detection Based On Clustering Algorithm

Posted on: 2010-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhang
GTID: 2178360272497572
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of Internet technology, the network environment has become more complex and network security has become increasingly important. Intrusion detection is an important field in computer systems and plays an important role in network security. As a proactive defense against attacks, intrusion detection complements tools such as firewalls. Intrusion detection techniques fall into two categories: anomaly detection and misuse detection. Misuse detection identifies intrusions by matching them against known attacks or known system vulnerabilities. Its disadvantage is that it cannot detect unknown attacks, and its signature database must be updated regularly. Anomaly detection builds a model of normal behavior, stored as a profile of normal states, and decides whether a behavior is an outlier according to the distance between the observed behavior and the normal profile. However, as network environments grow more complex, traditional intrusion detection models show more and more disadvantages, such as low adaptability. Even anomaly detection methods are unable to adapt to changes in the network environment, which leads to a low detection rate and a high false positive rate.

Data mining is a general knowledge discovery technology: it uses analytical tools to find models and relationships in large amounts of data, and it is very useful for intrusion detection. It can improve the intelligence, adaptability, and extensibility of an intrusion detection system, and thereby improve its quality.

This paper consists of three main parts.

The first part introduces intrusion detection technology and the basics of data mining, and then introduces density-based clustering. A density-based clustering algorithm treats regions of the data space that reach a certain density as clusters.
This approach overcomes a limitation of distance-based algorithms, which can only find "ball"-shaped clusters, and it can produce clusters of arbitrary shape. The idea is that when the number of points in a region exceeds a density threshold, those points are added to the same cluster.

The second part introduces the DBSCAN algorithm. DBSCAN is a typical density-based clustering algorithm, and many algorithms are derived from it. The AIDM algorithm proposed in this paper rests on the same theory as DBSCAN. DBSCAN requires the user to supply two parameters: the clustering radius ε and the minimum number of points MinPts. For AIDM the parameters are MinSup (minimum support) and the clustering radius ε. Adjusting the cluster profile during training is more convenient when the impact factor MinPts can be changed, so we use MinPts in our AIDM algorithm. The data formulas in AIDM are relatively simple and the algorithmic complexity is low, so the algorithm both reduces memory usage and improves the real-time performance of the IDS.

In the third part we improve our method. After surveying the literature, we select an incremental clustering algorithm for the intrusion detection system, and we also choose a method for preprocessing the extracted data. The clustering algorithms above are static: once clustering is finished, the normal behavior profile cannot be changed, and changing it requires re-clustering. This means retraining on the profile data, which takes a long time and a lot of work. The incremental DBSCAN algorithm is also introduced in the paper. Building on it, we apply the idea of incremental clustering to the AIDM algorithm, so that the static profile problem is solved and the normal behavior profile can be adjusted dynamically.
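The role of the two DBSCAN parameters described above, the radius ε and MinPts, can be illustrated with a minimal sketch of standard DBSCAN. This is not the AIDM algorithm from the thesis, whose details the abstract does not give. The function names, the Euclidean distance, and the convention that a point counts toward its own neighborhood are assumptions made for illustration.

```python
import math
from collections import deque

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to be a core point; may become a border point later.
            labels[i] = NOISE
            continue
        # points[i] is a core point: start a new cluster and grow it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


# Two dense groups and one isolated point (illustrative data, not from the thesis).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Raising `min_pts` or lowering `eps` makes the density requirement stricter, so more points end up labeled as noise; this is the kind of profile adjustment the parameters control.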
Experiments show that the incremental algorithm is effective. An update can affect the clusters maintained by the incremental AIDM algorithm in several ways, which this paper groups into four cases for each operation. When a data point is added, it may become noise, form a new cluster, be absorbed by an existing cluster, or unite two clusters. When a data point is deleted, the point may have been noise, a whole cluster may be deleted, only the point itself may be removed, or a cluster may split into two.

To improve the experimental results, we select data preprocessing algorithms that provide high-quality data. Using clustering for intrusion detection rests on two assumptions about the data. First, during training the amount of normal data is much larger than the amount of abnormal data. Second, normal and abnormal data have different characteristics. In our experiments, the training stage uses only normal data. Applying TF-IDF frequency weighting during preprocessing makes the characteristics of the original data more evident. We extract the raw audit records from the DARPA data sets, process them with the TF-IDF algorithm, and build a profile of normal behavior; experiments show that the method is correct and feasible.

We run many experiments on the KDD99 and DARPA data sets. When a new data packet arrives, the system decides whether it is normal; if it is abnormal, the system raises an alarm or records the packet information in the security log. On the KDD99 data set, the AIDM algorithm achieves a detection rate of 98.19% and a false positive rate of 3.91%, while the improved incremental AIDM algorithm achieves 98.52% and 2.10%. We compare the original static clustering algorithm with the incremental clustering algorithm, and the incremental clustering algorithm with the ADWICE algorithm, and obtain better results.
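The abstract does not define how its two rates are computed. The sketch below assumes the usual definitions: detection rate as the fraction of attack records that are flagged, and false rate as the false positive rate, that is, the fraction of normal records that are flagged. The function name and the sample labels are hypothetical and are not the thesis data.

```python
def detection_and_false_positive_rate(y_true, y_pred):
    """Compute (detection rate, false positive rate).

    y_true, y_pred: equal-length sequences of booleans, True meaning "attack".
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)          # attacks flagged
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)      # attacks missed
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)      # normal flagged
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)  # normal passed
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Illustrative labels: 4 attack records followed by 5 normal records.
y_true = [True, True, True, True, False, False, False, False, False]
y_pred = [True, True, True, False, True, False, False, False, False]
dr, fpr = detection_and_false_positive_rate(y_true, y_pred)
print(dr, fpr)  # 0.75 0.2
```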
The detection rate of the incremental AIDM algorithm is slightly lower than that of ADWICE, but its false positive rate is much lower. The experiments in this paper show that the intrusion detection model is feasible and effective.
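The detection step described in the abstract, in which each new record is compared with the normal behavior profile and abnormal ones are logged, can be sketched as follows. The abstract does not say how the profile is stored, so this sketch assumes it is a list of cluster centres and that a record is abnormal when its Euclidean distance to the nearest centre exceeds ε. All names and values are hypothetical.

```python
import math


def is_abnormal(record, profile_centres, eps):
    """True if the record lies farther than eps from every profile centre."""
    nearest = min(math.dist(record, centre) for centre in profile_centres)
    return nearest > eps


def monitor(records, profile_centres, eps):
    """Return the records that would be written to the security log."""
    security_log = []
    for record in records:
        if is_abnormal(record, profile_centres, eps):
            security_log.append(record)  # stand-in for raising an alarm
    return security_log


# Assumed normal profile with two cluster centres (illustrative values).
profile_centres = [(0.0, 0.0), (5.0, 5.0)]
incoming = [(0.2, 0.1), (5.5, 4.8), (9.0, 0.0)]
flagged = monitor(incoming, profile_centres, eps=1.0)
print(flagged)  # [(9.0, 0.0)]
```

A static profile keeps `profile_centres` fixed after training; the incremental approach in the thesis instead updates the profile as data are added or deleted.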
Keywords/Search Tags: intrusion detection, anomaly detection, incremental clustering, density-based clustering