Semi-supervised SVM-KNN And Application In Intrusion Detection

Posted on: 2011-05-04; Degree: Master; Type: Thesis
Country: China; Candidate: X R Luo
GTID: 2178360308953737; Subject: Communication and Information System
Abstract/Summary:
Intrusion detection is an important network security technology. It detects not only external attacks but also unauthorized activities by internal users. With the development of computer and data storage technology, the data transmitted over a network in real time now number in the hundreds of millions, which requires intrusion detection technology to process data both reliably and quickly.

Many practical machine learning problems involve a limited number of labeled samples and a large number of unlabeled samples. Correctly labeled samples are difficult to obtain, while unlabeled samples are easy to collect. Machine learning therefore faces a major problem: how to extract useful information from the unlabeled samples and combine it with the small number of labeled samples to improve learning. Semi-supervised learning emerged to address this need. It uses a limited number of labeled samples together with abundant unlabeled samples to improve classifier performance, and it has become a focus of machine learning research.

To address this problem, this paper presents SVM-KNN, a classification algorithm based on semi-supervised learning. In SVM classification the decision boundary depends on the support vectors, so adding such vectors to the training set can improve the classifier's performance. First, a weak SVM classifier is trained on the limited labeled data available, so the original training set is small. Second, a KNN classifier is used to select boundary vectors from the large pool of unlabeled samples, and these are merged with the original training samples. Finally, the SVM is retrained on the new training set, and the decision boundary is corrected iteratively. In this way the algorithm exploits the useful information in the unlabeled data to improve classification accuracy.

The algorithm is applied to intrusion detection. Experiments on the KDD99 data set show that the proposed algorithm improves detection accuracy and reduces detection time. Because intrusion detection data are high-dimensional, a variant of the individually optimal feature combination algorithm is used to reduce the data dimensionality and to choose the valuable attributes, which optimizes the algorithm for intrusion detection. The experimental results confirm the effectiveness of the algorithm.
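
The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not give the kernel, the boundary-selection rule, or any parameter values, so the sketch assumes a linear SVM trained by hinge-loss subgradient descent, treats the unlabeled samples with the smallest absolute decision value as the boundary vectors, and labels them by a KNN vote over the current training set. All function names and parameters (`train_linear_svm`, `svm_knn`, `rounds`, `batch`, `k`, `make_blobs`) are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01, seed=0):
    """Linear SVM via subgradient descent on the hinge loss; labels in {-1, +1}."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1
            w = [wj * (1 - lr * lam) for wj in w]      # L2 shrinkage
            if violated:                               # hinge-loss subgradient step
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def decision(model, x):
    w, b = model
    return dot(w, x) + b

def predict(model, x):
    return 1 if decision(model, x) >= 0 else -1

def knn_label(x, X_lab, y_lab, k=3):
    """Majority vote among the k nearest labeled points (use an odd k)."""
    nearest = sorted(
        range(len(X_lab)),
        key=lambda i: sum((a - c) ** 2 for a, c in zip(x, X_lab[i])),
    )[:k]
    return 1 if sum(y_lab[i] for i in nearest) > 0 else -1

def svm_knn(X_lab, y_lab, X_unl, rounds=3, batch=4, k=3):
    """Assumed loop: train SVM, pick boundary vectors, label by KNN, retrain."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    model = train_linear_svm(X_lab, y_lab)             # step 1: weak SVM
    for _ in range(rounds):
        if not pool:
            break
        # Step 2: unlabeled samples closest to the current decision boundary.
        pool.sort(key=lambda x: abs(decision(model, x)))
        picked, pool = pool[:batch], pool[batch:]
        new_labels = [knn_label(x, X_lab, y_lab, k) for x in picked]
        X_lab += picked
        y_lab += new_labels
        model = train_linear_svm(X_lab, y_lab)         # step 3: retrain
    return model, X_lab, y_lab

def make_blobs(n_per_class, seed):
    """Synthetic two-class 2-D data for illustration only (not KDD99)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((-1, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)))
            y.append(label)
    return X, y
```

A call such as `svm_knn(X_lab, y_lab, X_unl)` with four labeled points and a pool from `make_blobs(20, seed=1)` returns the retrained model together with the enlarged training set. On real intrusion data the features would first need to be encoded numerically and scaled, which this sketch does not cover.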
Keywords/Search Tags: Semi-supervised learning, Support vector machines, K-Nearest neighbour, Boundary vectors, Intrusion detection