Research On Unbalanced Intrusion Data Detection Based On Oversampling

Posted on: 2020-09-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xiao
Full Text: PDF
GTID: 2428330623965348
Subject: Software engineering
Abstract/Summary:
To address the problem that current intrusion detection systems receive little training on unknown attacks, which results in a low detection rate for specific attacks, this thesis proposes an intrusion detection method that combines a Synthetic Minority Oversampling Technique (SMOTE) based on maximum dissimilarity coefficient density with a Deep Belief Network (DBN) and a Gradient Boosting Decision Tree (GBDT). First, a SMOTE oversampling algorithm based on the maximum dissimilarity coefficient density is applied during data preprocessing. The mean of the maximum dissimilarity coefficients between samples within the neighborhood radius is taken as the maximum dissimilarity coefficient density of a point. The oversampling base set is selected using a within-class threshold on this density, and the number of minority-class samples is then increased with SMOTE. Next, the DBN extracts low-dimensional features from the samples through bottom-up unsupervised learning followed by top-down supervised fine-tuning. Finally, GBDT builds decision trees iteratively: each tree learns the residuals of the combined output of the preceding trees, and the final classification result is obtained from the ensemble.

The experiments use NSL-KDD, a classical intrusion detection data set. Remote to Local (R2L) and User to Root (U2R) are extracted from the data set as minority attack classes, and the maximum dissimilarity coefficient density SMOTE oversampling is applied to these two classes to balance the sample distribution. DBN and GBDT then produce the final classification results. Precision, Recall, F-measure, Classification Error, Missing Alarm, and Positive Rate are selected as the evaluation criteria for intrusion detection. Classical oversampling algorithms, including random oversampling, SMOTE, Borderline-SMOTE, Adaptive Synthetic Sampling (ADASYN), and related algorithms, are selected to compare the effects of oversampling and of learning classification.

Experimental results show that the proposed method can effectively improve the classification of R2L and U2R while maintaining the detection rate of most categories. It also addresses the problem of imbalanced sample classification in intrusion detection systems and effectively improves the detection performance of the system. This dissertation has 9 figures, 12 tables and 71 reference...
Keywords/Search Tags: intrusion detection, imbalance classification, maximum dissimilarity coefficient, SMOTE, DBN, GBDT
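
For readers unfamiliar with the oversampling step the abstract builds on, the following is a minimal sketch of plain SMOTE interpolation in standard-library Python. It is not the thesis's implementation. The abstract does not define the maximum dissimilarity coefficient, so the density-based selection of the oversampling base set is not reproduced here; the hypothetical `base` argument stands in for whatever base set that step would produce, and Euclidean distance is assumed for the neighbour search. The function name `smote` and all parameters are illustrative.

```python
import math
import random


def smote(base, minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    base     -- seed points to oversample from (assumed to be rows taken
                from `minority`; how they are selected is up to the caller)
    minority -- all minority-class samples, used as the neighbour pool
    k        -- number of nearest neighbours considered per seed point
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(base)
        # Sort the pool by Euclidean distance to x (an assumption; the
        # thesis uses its own dissimilarity measure). Position 0 is x
        # itself, or an exact duplicate of it, so it is skipped.
        neighbours = sorted(minority, key=lambda p: math.dist(x, p))[1:k + 1]
        nb = rng.choice(neighbours)
        # New point lies on the segment between x and the chosen neighbour.
        gap = rng.random()
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic


# Toy usage: four 2-D minority samples, the first two used as the base set.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority[:2], minority, n_new=10, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the choice of base set controls where new samples can appear, which is presumably why the thesis filters it by density before oversampling.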