
Research on an Intrusion Detection Classification Algorithm Based on Imbalanced Datasets and Decision Trees

Posted on: 2011-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Xia
GTID: 2178360308973205
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, network security has become an increasingly challenging problem, so the research and implementation of intrusion detection systems (IDS) has become an important task in computer research and applications. Data mining (DM), which can analyze and process huge volumes of data automatically and efficiently and mine them for latent rules, regularities, and patterns, has been introduced into IDS research. Classification is one DM technique used in DM-based IDS. However, traditional classification methods perform poorly in IDS because the class distribution of IDS datasets is imbalanced, so new classification strategies are needed for datasets with imbalanced class distributions.

The C4.5 algorithm is simple and effective, and its classification principles are easy for people to understand and accept. In the classification of imbalanced datasets in particular, the classical C4.5 algorithm has become the usual baseline for comparison. In this paper, we analyze the features of current intrusion detection training sets and the approaches to classifying imbalanced datasets. We propose a classifier named CCBCE, built from an under-sampling preprocessor and the C4.5 algorithm. The under-sampling preprocessor applies the GCA clustering algorithm and K-nearest-neighbor random under-sampling to the majority class, so that border, noise, and redundant majority-class samples are removed more accurately and the imbalance of the training set is reduced. At the same time, we use the AdaBoost algorithm to build an ensemble of C4.5 classifiers, named C4.5BCE, as a second classification stage; this keeps useful majority-class information from being lost through under-sampling and thereby improves overall classification performance.

We then evaluate CCBCE on the KDDCUP99 intrusion detection dataset from UCI and compare its classification performance with that of C4.5 alone and of C4.5 combined with the under-sampling preprocessor. In addition, we run experiments with different ensemble sizes for C4.5BCE and analyze the results, which show that classification and detection performance improves as the ensemble size increases; once the number of individual classifiers in the ensemble reaches a certain level, the performance of the system becomes stable.
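
The two stages described in the abstract can be illustrated with a minimal sketch. It rests on loud assumptions and is not the thesis's CCBCE implementation: a plain k-nearest-neighbor filter followed by random under-sampling stands in for the GCA clustering step, one-level decision stumps stand in for C4.5 trees, and every function name, parameter, and data point below is hypothetical. Labels are -1 for the majority class (for example, normal traffic) and +1 for the minority class.

```python
import math
import random

def knn_clean_and_undersample(X, y, majority=-1, k=3, keep=0.5, seed=0):
    """Simplified stand-in for the under-sampling preprocessor (NOT GCA).
    Drops majority samples whose k nearest neighbours are mostly minority
    (border/noise candidates), then randomly keeps a fraction of the rest."""
    rng = random.Random(seed)
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    kept = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority:
            continue
        neighbours = sorted((j for j in range(len(X)) if j != i),
                            key=lambda j: dist(xi, X[j]))[:k]
        same = sum(1 for j in neighbours if y[j] == majority)
        if 2 * same > k:  # most neighbours share the label: keep as candidate
            kept.append(i)
    n_keep = max(1, int(len(kept) * keep))
    kept = rng.sample(kept, n_keep) if kept else []
    idx = sorted(kept + [i for i, yi in enumerate(y) if yi != majority])
    return [X[i] for i in idx], [y[i] for i in idx]

def best_stump(X, y, w):
    """Exhaustively pick the (feature, threshold, polarity) stump with the
    lowest weighted error. Stumps are an assumption standing in for C4.5."""
    best = None  # (weighted_error, feature, threshold, polarity)
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[f] <= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best

def stump_predict(stump, x):
    _, f, t, pol = stump
    return pol if x[f] <= t else -pol

def adaboost(X, y, rounds=5):
    """Discrete AdaBoost; `rounds` plays the role of the ensemble size."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = best_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0) / div by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data: ten majority points, three minority points, and one majority
# "noise" point sitting inside the minority region.
X = [(float(i), 0.0) for i in range(10)] + \
    [(20.0, 0.0), (21.0, 0.0), (22.0, 0.0)] + [(21.0, 1.0)]
y = [-1] * 10 + [1] * 3 + [-1]

Xs, ys = knn_clean_and_undersample(X, y)
model = adaboost(Xs, ys, rounds=5)
print(len(X), "->", len(Xs), "training samples")
print(predict(model, (25.0, 0.0)))
```

Varying `rounds` is the hypothetical counterpart of the ensemble-size experiment mentioned above; the toy data is far too small to reproduce any of the reported behavior.
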
Keywords/Search Tags: intrusion detection, imbalanced dataset, C4.5 algorithm, under-sampling technique, AdaBoost algorithm