
Incorporating K-means, Triangle Area Support Vector Machine And Feature Selection Algorithms For Intrusion Detection System

Posted on: 2010-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: P J Tang
Full Text: PDF
GTID: 2178360302960748
Subject: Computer application technology
Abstract/Summary:
As computer and network technology has developed, people's work and daily lives have come to rely increasingly on computers and the Internet, and important information of many kinds now spreads around the world over the Internet. Although networks improve working efficiency, the accompanying security problems cannot be overlooked: they concern not only personal information security but national information security as well. Rapidly evolving attack techniques and malicious access, inherent security flaws caused by coding and design defects, and the large number of computer viruses leave current security technologies such as firewalls, identity authentication, and operating system security kernels too weak to protect a system on their own. Intrusion detection systems, which combine active protection, dynamic monitoring, and system protection, have developed rapidly in recent decades. In particular, intrusion detection systems that merge supervised and unsupervised machine learning algorithms have become a hot spot in the related research fields.

Based on a study of related domestic and international papers and academic resources, this thesis proposes an intrusion detection system that blends a clustering algorithm with an improved classification algorithm. First, by calculating the information gain of every feature for each specific attack type, we delete from the KDD CUP 1999 data the redundant and duplicate features that play no substantial role in discrimination. Second, we use K-means to cluster the data remaining after feature selection into five classes. Each data point is then combined with two of the five cluster centroids to form a triangle; the ten possible pairs of centroids give ten triangles per data point. We calculate the areas of these ten triangles and use the ten areas as the new feature vector of the data point. Finally, we apply 10-fold cross-validation and LibSVM to train and test the intrusion detection model on the new feature vectors and obtain the final experimental results.

Our system achieves an accuracy of 99.83%, a detection rate of 99.88%, and a false alarm rate of 2.99% on the 10% subset of the KDD CUP 1999 evaluation data set. It also achieves better detection performance, in terms of precision and recall, for specific attack types.
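The information-gain step of the feature selection can be illustrated with a minimal Python sketch. It assumes discrete feature values and a one-vs-rest labelling for a single attack type; the abstract does not say how continuous KDD CUP 1999 features are discretized or what gain threshold is used, so the function names and the toy data here are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after
    splitting the records by the value of one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy one-vs-rest labels for a single attack type (hypothetical data).
labels = ["attack", "attack", "normal", "normal"]
informative = ["a", "a", "b", "b"]    # separates the classes perfectly
uninformative = ["x", "y", "x", "y"]  # carries no class information

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
print(gain_informative, gain_uninformative)
```

Under this reading, features whose gain is close to zero for every attack type would be the candidates for removal.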
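The triangle-area feature construction can be sketched in the same way. This minimal Python sketch assumes that each triangle is formed from the data point and one unordered pair of the five centroids (ten pairs in total), and that areas are computed with Heron's formula from pairwise Euclidean distances; the abstract states neither detail explicitly, and the centroids below are hypothetical values rather than actual K-means output.

```python
import math
from itertools import combinations

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def triangle_area(p, q, r):
    """Area of triangle p, q, r in any dimension, via Heron's formula."""
    a, b, c = euclidean(p, q), euclidean(q, r), euclidean(p, r)
    s = (a + b + c) / 2.0
    # Clamp at zero: rounding can make the product slightly negative
    # when the three points are (nearly) collinear.
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))

def triangle_area_features(point, centroids):
    """One area per unordered pair of centroids, in a fixed order."""
    return [triangle_area(point, c1, c2)
            for c1, c2 in combinations(centroids, 2)]

# Hypothetical 2-D centroids, chosen only to keep the arithmetic simple.
centroids = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0), (2.0, 6.0)]
features = triangle_area_features((1.0, 1.0), centroids)
print(len(features))
```

With five centroids the feature vector always has ten entries, regardless of how many original features survive feature selection, and it is this fixed-length vector that would be passed to the SVM.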
Keywords/Search Tags: Feature selection, Triangle area feature representation, Machine learning, K-means, Support vector machine