Research On Intrusion Detection Based On Semi-Supervised Fuzzy Clustering

Posted on: 2011-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: R S Wang
Full Text: PDF
GTID: 2178360308457255
Subject: Computer software and theory
Abstract/Summary:
An intrusion detection system (IDS) is a newer generation of security technology that followed traditional measures such as data encryption, access control, and firewalls. By collecting and analyzing information from key points of a network or system, it determines whether the network or system has been attacked or its security policy has been violated, so that measures can be taken against those attacks or behaviors to protect the computer.
Traditional intrusion detection systems have focused mainly on supervised learning. Over the last 30 years of development, traditional IDS has achieved a high detection rate, but it places strict requirements on the training dataset: the purity of the training data must be guaranteed, which raises the cost of the whole system. In recent years some scholars have proposed intrusion detection algorithms based on unsupervised learning. Such systems do not need a labeled and classified training dataset, and they can efficiently detect unknown intrusion attacks. However, years of research have revealed a disadvantage: although they need fewer resources than systems based on supervised learning, without the supervision of labeled data their performance is much lower than that of traditional detection methods.
In general, traditional machine learning methods consider only one situation: either labeled samples or unlabeled samples. In practical applications, however, the two kinds of samples often coexist and are correlated. How to make full use of both kinds of data to extract useful information has become a research hotspot.
Semi-supervised learning, which combines information from both labeled and unlabeled data for the learning task, has drawn wide attention. Because it is the key technology for resolving this problem, it has become a research hotspot in machine learning in recent years.
Clustering is one of the most important methods in data mining. Traditional clustering algorithms are regarded as "hard clustering": once they assign a data point to a class, the assignment can hardly be changed. Fuzzy cluster analysis, which combines clustering with fuzzy theory, addresses this problem through its fuzzy processing capability and has been widely used in many fields. Combining intrusion detection with fuzzy cluster analysis can therefore, to a certain extent, reduce the false alarm rate and improve system performance.
In this thesis, two intrusion detection models are proposed, one based on semi-supervised K-Means and one based on semi-supervised fuzzy clustering, by combining semi-supervised learning with K-Means and with fuzzy clustering respectively. The detection module and response module are designed in detail. The K-Means and fuzzy clustering algorithms are analyzed, and two algorithms are proposed: SK-Means (semi-supervised K-Means) and SFCA (semi-supervised fuzzy clustering algorithm). They are used to generate different detectors. Combining semi-supervised learning with K-Means greatly reduces the randomness of K-Means initialization. Combining it with fuzzy clustering exploits both the fuzzy processing capability of fuzzy cluster analysis and the supervision information from labeled and unlabeled data, so that the two kinds of data work in coordination.
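The abstract does not give the details of SK-Means or SFCA. As a minimal sketch of the two ideas it describes, the Python code below assumes a common seeded variant of K-Means in which the labeled samples fix the initial centroids (removing random initialization), together with the standard fuzzy c-means membership formula for soft assignment. The function names `seeded_kmeans` and `fuzzy_memberships`, their parameters, and the use of NumPy are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def seeded_kmeans(X, labeled_idx, labeled_y, n_iter=20):
    """Seeded K-Means sketch (assumed variant, not the thesis's exact SK-Means)."""
    X = np.asarray(X, dtype=float)
    labeled_idx = np.asarray(labeled_idx)
    labeled_y = np.asarray(labeled_y)
    classes = np.unique(labeled_y)  # sorted class labels, one cluster per class
    # Initial centroids come from the labeled samples, so there is no random start.
    centroids = np.array(
        [X[labeled_idx[labeled_y == c]].mean(axis=0) for c in classes]
    )
    for _ in range(n_iter):
        # Distance from every sample to every centroid, shape (n_samples, k).
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        # Labeled samples always stay in the cluster of their known class.
        assign[labeled_idx] = np.searchsorted(classes, labeled_y)
        new_centroids = np.array([
            X[assign == k].mean(axis=0) if np.any(assign == k) else centroids[k]
            for k in range(len(classes))
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return classes[assign], centroids

def fuzzy_memberships(X, centroids, m=2.0):
    """Standard fuzzy c-means memberships; each row sums to 1 (m > 1 is the fuzzifier)."""
    X = np.asarray(X, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    dist = np.maximum(dist, 1e-12)  # avoid division by zero at a centroid
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)
```

With hard clustering a sample belongs to exactly one cluster, whereas `fuzzy_memberships` returns a degree of membership in every cluster, which is the property the abstract credits with reducing false alarms.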
Finally, the proposed models and algorithms are simulated on the KDD'99 dataset, and results are obtained under different parameter settings. The experimental results show that, because fewer labeled data are used, both algorithms train much faster, and the detection rate and detection speed improve. The results also show that even when the data contain many intrusions, both algorithms maintain a high detection rate and are not easily affected by outliers. Under the same conditions, the intrusion detection system based on semi-supervised fuzzy clustering performs better than the one based on semi-supervised K-Means.
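The abstract reports detection rate and false alarm rate without defining them. As a hedged illustration, the sketch below assumes the definitions commonly used in intrusion detection work: detection rate is the fraction of attack records flagged as attacks, and false alarm rate is the fraction of normal records flagged as attacks. The function name `detection_metrics` and the `attack_label` parameter are hypothetical.

```python
def detection_metrics(y_true, y_pred, attack_label=1):
    """Return (detection_rate, false_alarm_rate) under the assumed definitions."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == attack_label:
            if pred == attack_label:
                tp += 1  # attack correctly detected
            else:
                fn += 1  # attack missed
        else:
            if pred == attack_label:
                fp += 1  # normal record raised a false alarm
            else:
                tn += 1  # normal record correctly passed
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate
```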
Keywords/Search Tags: intrusion detection, semi-supervised learning, fuzzy clustering, anomaly detection