Clustering Algorithm Based On Rough Sets And Its Application In Intrusion Detection

Posted on: 2016-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: D L Zhang
GTID: 2308330503960023
Subject: Computer technology
Abstract/Summary:
In recent years, Internet security problems have grown increasingly serious, and network attack methods have become more diverse, complex, and intelligent. Traditional static defense technologies, such as firewalls and data encryption, can no longer meet the demand for network security. As an active and dynamic security defense technology, intrusion detection has developed rapidly and attracted much attention. However, current intrusion detection methods still have many problems in practical application, for instance low detection precision and a high false alarm rate.

To address these problems, this thesis uses the k-modes clustering algorithm to detect intrusions. As an efficient extension of the k-means algorithm, the k-modes algorithm has many merits. However, the current k-modes algorithm still has problems to be solved, including: (1) the definition of the distance metric is not appropriate; (2) there is no effective mechanism for selecting the initial cluster centers.

To make the k-modes algorithm more applicable to intrusion detection, this thesis uses rough set theory to solve these problems. First, to solve the problem of selecting the initial cluster centers for k-modes clustering, we use the concepts of rough entropy and attribute significance in rough sets to calculate the weight of each attribute, and propose a novel algorithm for selecting the initial cluster centers. Second, we propose a new k-modes clustering algorithm and apply it to intrusion detection, which yields a new unsupervised intrusion detection model. The proposed model does not require the raw data to be labeled beforehand, and can accurately and quickly detect intrusions in categorical data sets. Hence, our model can solve the problems of current intrusion detection systems to a certain extent.

The main work of this thesis includes:

(1) An algorithm for selecting the initial cluster centers based on the weighted density and the weighted overlap distance. To solve the problem of selecting the initial cluster centers for the current k-modes clustering algorithm, we propose an algorithm called Ini_Weight. This algorithm selects the initial cluster centers by calculating the density of objects and the distance between objects. In both calculations, each attribute is assigned a weight according to its significance, which effectively reflects the differences between attributes. We evaluate the performance of the Ini_Weight algorithm on several UCI data sets. The experimental results show that Ini_Weight can accurately select the cluster centers.

(2) A k-modes clustering algorithm based on the weighted overlap distance. Building on the Ini_Weight algorithm, we further propose a new k-modes clustering algorithm called WODKM. WODKM uses the Ini_Weight algorithm to select the initial centers and the weighted overlap distance metric to calculate the distance between objects, which avoids the problems of the traditional k-modes algorithm.

(3) The unsupervised intrusion detection model UIDM_WODKM. We apply the WODKM clustering algorithm to intrusion detection and obtain a new unsupervised intrusion detection model, UIDM_WODKM. The proposed model detects intrusions in two steps. First, we divide the clusters in the clustering result into normal clusters and abnormal clusters. Second, for each object x to be detected, we calculate the weighted average density of x in each cluster and the weighted overlap distance between x and each cluster center. We evaluate the performance of the proposed model on the KDD Cup 99 data set. The experimental results show that the UIDM_WODKM model is an efficient unsupervised intrusion detection method.
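
To make the idea of a weighted overlap distance inside a k-modes loop more concrete, the following is a minimal Python sketch and not the thesis's implementation. It rests on several assumptions that the abstract does not confirm: the distance is taken to be the sum of the weights of the attributes on which two objects disagree, the attribute weights are supplied by the caller (the rough-entropy formula is not given here), and the initial centers are passed in directly instead of being chosen by Ini_Weight. The function names and the sample records are hypothetical.

```python
from collections import Counter


def weighted_overlap_distance(x, y, weights):
    """Assumed form: sum the weights of the attributes where x and y differ."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def mode_of(objects):
    """Return the per-attribute most frequent value among the given objects."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


def k_modes(data, centers, weights, max_iter=100):
    """Plain k-modes iteration using the weighted overlap distance.

    `centers` are caller-supplied initial centers (a stand-in for Ini_Weight).
    Returns the cluster label of each object and the final cluster modes.
    """
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda j: weighted_overlap_distance(x, centers[j], weights))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update step: each center becomes the mode of its members.
        for j in range(len(centers)):
            members = [x for x, label in zip(data, labels) if label == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mode_of(members)
    return labels, centers


# Hypothetical categorical connection records: (protocol, service, flag).
records = [
    ("tcp", "http", "SF"), ("tcp", "http", "SF"), ("tcp", "smtp", "SF"),
    ("udp", "private", "REJ"), ("udp", "private", "REJ"), ("icmp", "private", "REJ"),
]
attribute_weights = [0.2, 0.5, 0.3]  # assumed weights, not derived from rough entropy
labels, modes = k_modes(records, [records[0], records[3]], attribute_weights)
print(labels, modes)
```

With unequal weights, a disagreement on a highly significant attribute moves two objects further apart than a disagreement on a minor one, which is the effect the abstract attributes to weighting attributes by their significance.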
Keywords/Search Tags: Clustering analysis, rough set theory, weighted average density, weighted overlap distance, initial cluster center, intrusion detection