
Research On Network Security Based On Imbalanced Data Classification

Posted on: 2019-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z Jia
GTID: 2428330593951117
Subject: Electronics and Communications Engineering
Abstract/Summary:
As networks grow in size and complexity, network security faces enormous challenges. Analyzing the massive data produced by IDSs, firewalls, NetFlow, and other security devices is currently an important means of assessing the state of network security. However, existing network security analysis systems still have shortcomings in classifying imbalanced data. Although existing classification algorithms have greatly improved in accuracy, they still suffer from uncertainty and delay. On the one hand, they give no special consideration to the classification accuracy of minority-class data, so anomalous data is misidentified, which can lead to security incidents. On the other hand, serious delay in detecting emerging types of anomalous data causes missed detections and false positives, which can have catastrophic consequences for the network system and result in huge losses.

To solve these problems, a combined classifier algorithm based on autonomous learning is presented. First, valuable samples are selected from a large volume of unlabeled network security data by means of autonomous learning and then labeled, and the labeled data is used as the training set of the combined classifier. Then, the combined classifier uses the AdaBoost algorithm based on C4.5 decision trees, and a misclassification cost and a penalty function are introduced into the training process, which increases the diversity of the classification process, achieves higher classification accuracy, and improves the performance of the security data classifier.

Finally, the proposed method is verified by simulation experiments and compared with similar algorithms. The experiments use the KDD Cup 99 dataset and packets captured from an actual network as training and test sets, which better reflects a real network environment. After preprocessing and feature extraction, the data sets are evaluated on recall, accuracy, F-measure, and the detection of new types of anomalous data. The results show that, compared with the traditional AdaBoost algorithm, the proposed algorithm performs well in classification accuracy and improves detection performance for minority classes. The algorithm is then applied to a network backtracking analysis system to achieve more comprehensive and in-depth traffic analysis, business analysis, and network failure discrimination.
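The abstract does not give the exact form of the misclassification cost, so the following is only a minimal sketch of one common way a per-class cost can enter an AdaBoost-style weight update. The function name `cost_sensitive_reweight`, the `cost` mapping, and the multiplicative use of the cost are assumptions made for illustration, not the thesis's actual formulation; the penalty function mentioned in the abstract is not modeled here.

```python
import math

def cost_sensitive_reweight(weights, y_true, y_pred, cost):
    """One boosting round: return (alpha, new_weights).

    weights : current sample weights, assumed to sum to 1
    y_true  : true class labels
    y_pred  : labels predicted by the current base classifier
    cost    : hypothetical mapping from true class to misclassification cost
    """
    # Weighted error of the current base classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Clamp to avoid division by zero or log(0) for perfect/useless learners.
    err = min(max(err, 1e-10), 1 - 1e-10)
    # Standard AdaBoost vote weight for this base classifier.
    alpha = 0.5 * math.log((1 - err) / err)

    new_weights = []
    for w, t, p in zip(weights, y_true, y_pred):
        if t != p:
            # Misclassified: boost the weight, scaled by the class cost
            # (assumption: the cost multiplies the usual exp(alpha) factor).
            new_weights.append(w * cost[t] * math.exp(alpha))
        else:
            # Correctly classified: shrink the weight as in plain AdaBoost.
            new_weights.append(w * math.exp(-alpha))

    # Renormalize so the weights again form a distribution.
    total = sum(new_weights)
    return alpha, [w / total for w in new_weights]

# Three majority samples (class 0) and one minority sample (class 1);
# the base classifier predicts the majority class for everything.
alpha, w = cost_sensitive_reweight(
    [0.25] * 4, [0, 0, 0, 1], [0, 0, 0, 0], {0: 1.0, 1: 5.0}
)
```

With equal costs this reduces to the ordinary AdaBoost update. A higher cost on the minority class shifts more of the weight onto the misclassified minority sample, so the next base classifier is pushed harder to get it right.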
Keywords/Search Tags: Network security, Anomaly classification, Imbalanced data, Combined classifier, AdaBoost algorithm