
The Research on Machine Learning Methods and Their Applications for Intrusion Detection

Posted on: 2008-06-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Li
GTID: 1118360272466754
Subject: Computer system architecture
Abstract/Summary:
Intrusion detection is an information security technology that detects incursions into a computer network, and it is essential for the proactive protection of network information security. Given the unknown security issues that the next-generation Internet may encounter, and the increasingly frequent distributed, multi-target, multi-stage network attacks seen today, intrusion detection systems must become both more efficient and more intelligent. Machine learning methods for classification and prediction have come into use in intrusion detection. Nevertheless, several problems remain unresolved: strong correlation among sample data, large numbers of duplicated training samples, long training times, and difficulty in identifying intrusion samples.

PCA-IC, a feature compression algorithm that combines principal component analysis (PCA) with an immune clustering algorithm, is designed to exploit the duplicated or similar samples in intrusion detection data and the correlation among their feature parameters. It compresses the data without losing the feature knowledge implicit in them, thereby reducing the number of samples used for machine learning. PCA is applied first to remove the correlation among parameters, and immune clustering then eliminates similar samples. In simulation experiments on the KDDCUP99 intrusion detection data sets, the sample compression rate reached 89%.

Misuse detection models the known weaknesses of network systems and application software and pattern-matches observed user behavior and resource use against those models; it is therefore a multi-class classification problem.
General multi-class support vector machines (SVMs), which must use all of their classifiers in the calculation, process too many duplicated samples, run slowly, and have unsatisfactory real-time performance. To address this, the paper presents the Binary Tree with Priority for Multi-class Support Vector Machine (BTPM-SVM) algorithm. BTPM-SVM introduces the concept of priority, according to which multiple SVMs are arranged into an asymmetric, graded binary tree. The number of SVM training samples decreases rapidly as the grade increases, which greatly reduces the number of duplicated samples and speeds up training. In simulation experiments on the KDDCUP99 misuse detection data sets, the sample detection rate reached 96%, and computation time was reduced by 57% for the same amount of data.

Anomaly detection distinguishes normal from abnormal system behavior according to network traffic characteristics and host audit data. Because the training samples in anomaly detection are unlabelled and unbalanced, attack detection is treated as outlier detection and solved with a hypersphere one-class SVM. In a simulation experiment on the "MIT lpr" system-call sample data sets provided by the University of New Mexico, 1000 of the 1001 abnormal samples were correctly identified.

Masquerader detection monitors the behavior of legitimate users in the system, preventing them from performing unauthorized operations and preventing other users from fraudulently using their accounts for illegal or malicious acts. In this paper, a two-dimensional co-occurrence matrix modeling method is employed to model user behavior accurately. Principal component analysis is also applied to reduce the high dimensionality of the samples.
The multi-class support vector machine is then used to identify the processed samples. In a performance test on the SEA data sets, the sample identification rate reached 80.4%.

To achieve a smooth transition from IPv4 to IPv6 networks and to ensure the safe and orderly operation of the next-generation Internet, this paper builds on the above algorithms to design and implement MLIDS, a prototype intrusion detection system based on machine learning. In simulation tests in IPv4 and IPv6 environments, the MLIDS prototype achieved detection rates of 97% and 98%, respectively. This relatively high detection accuracy demonstrates the effectiveness and practicality of the BTPM-SVM and hypersphere one-class SVM algorithms proposed in this paper.
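
The priority-ordered binary tree behind BTPM-SVM can be illustrated with a minimal Python sketch. The abstract does not say how priority is assigned or how each node is trained, so two assumptions are made here purely for illustration: priority is taken to be class frequency (most frequent class first), and a nearest-centroid rule stands in for each binary SVM. All function names and the toy data are hypothetical. The sketch only shows the structural point: each level separates one class from the rest, and the training set handed to the next level shrinks.

```python
from collections import Counter


def centroid(rows):
    """Mean vector of a non-empty list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_priority_tree(samples, labels):
    """Build a one-vs-rest cascade (an asymmetric binary tree).

    Returns (nodes, default_label, train_sizes), where train_sizes[i]
    is the number of samples used to train the node at level i.
    """
    # ASSUMPTION: priority = class frequency, most frequent class first.
    order = [cls for cls, _ in Counter(labels).most_common()]
    data = list(zip(samples, labels))
    nodes, train_sizes = [], []
    for cls in order[:-1]:
        pos = [x for x, y in data if y == cls]
        neg = [x for x, y in data if y != cls]
        train_sizes.append(len(data))
        # ASSUMPTION: a nearest-centroid rule stands in for a binary SVM.
        nodes.append((cls, centroid(pos), centroid(neg)))
        # Only the samples of lower-priority classes go one level deeper.
        data = [(x, y) for x, y in data if y != cls]
    return nodes, order[-1], train_sizes


def predict(model, x):
    """Walk down the cascade; stop at the first node that claims x."""
    nodes, default_label, _ = model
    for cls, c_pos, c_neg in nodes:
        if sq_dist(x, c_pos) <= sq_dist(x, c_neg):
            return cls
    return default_label


# Toy data: three well-separated classes of different sizes.
toy_samples = [(0, 0), (1, 0), (0, 1), (1, 1),
               (10, 10), (11, 10), (10, 11),
               (0, 10), (1, 10)]
toy_labels = ["normal"] * 4 + ["dos"] * 3 + ["probe"] * 2
toy_model = train_priority_tree(toy_samples, toy_labels)
```

On this toy data the first node is trained on all nine samples and the second on only the five that remain once the highest-priority class has been split off, which is the shrinking effect the abstract attributes to the graded tree. A sample that the first node accepts is also classified after a single evaluation, rather than after consulting every classifier.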
Keywords/Search Tags: Intrusion Detection, Machine Learning, Support Vector Machine, Data compression, Eigen Correlation Matrix