DDoS Attack Detection Technology Research Based On Machine Learning And Statistical Analysis

Posted on:2018-05-19

Degree:Doctor

Type:Dissertation

Country:China

Candidate:B Jia

Full Text:PDF

GTID:1318330518995998

Subject:Software engineering

Abstract/Summary:

With the rapid development of computer and communications technology, and the rise of cloud computing, the Internet of Things, the mobile Internet, and big data in the Internet Plus era, Distributed Denial of Service (DDoS) attacks have become one of the most destabilizing factors in the information and network environment. As botnets have become prevalent in recent years, the harm caused by DDoS attacks has grown increasingly serious. Because every major attack event is highly damaging and affects a wide range of targets, DDoS attack detection remains an important research topic in information and network security. However, existing research still has shortcomings. For instance: 1) some indicators, such as Detection Rate (DR), can be guaranteed, but only at the cost of longer detection time and heavy resource consumption; 2) DR, Accuracy, Precision, and False Positive Rate (FPR) cannot be well balanced in attack detection.

Given this, the dissertation applies current methods and techniques from machine learning, data mining, and statistical analysis to detect large-traffic DDoS attacks on the Internet in real time, efficiently, and accurately, based on the characteristics of the attacks and on the extraction and analysis of different attribute fields in the attack traffic. The main contributions and innovations of the dissertation are as follows.

(1) To address attack detection on large traffic volumes in the era of big data, and in particular the poor performance of real-time DDoS attack detection, we researched and designed a Real-time Attack Detection (RTAD) method based on a Multivariate Dimensionality Reduction Analysis (MDRA) algorithm. The algorithm builds on multivariate statistical analysis and correlation analysis from statistics, and on Principal Component Analysis (PCA) from machine learning.
The RTAD method detects large-traffic DDoS attacks in real time by reducing dimensionality and eliminating the correlation among attribute fields in network traffic. After preprocessing the experimental data and running verification experiments, we draw the following conclusions. The RTAD method outperforms the attack detection method based on the Multivariate Correlation Analysis (MCA) algorithm in Precision and True Negative Rate (TNR). It also has clear advantages in CPU computing time and memory consumption.

(2) To address the lack of cooperative detection, poor scalability, and deployment difficulty in traditional centralized and quasi-distributed DDoS attack detection methods, the dissertation studied a Random Forest Distribution Detection (RFDD) model for DDoS attack traffic based on an ensemble classifier. The core of the model adopts Random Forest, an ensemble learning method based on combined classifiers that is widely used in machine learning. We combine the Random Forest algorithm with a distributed parallel computing framework. To detect DDoS attacks accurately, we reduce noise and eliminate the correlation among the attribute fields of attack traffic. The RFDD model scales well and can adapt to the dynamic adjustment and deployment of anomaly monitoring in the network environment. From experimental verification, we draw the following conclusions. The RFDD model outperforms the AdaBoost method in DR, Accuracy, Precision, and FPR.
Across different threshold values, the RFDD model remains relatively stable on these four indicators.

(3) To address the poor generalization ability and stability of DDoS attack detection models based on homogeneous classifiers, the dissertation proposed a Heterogeneous Multi-classifier Ensemble Learning (HMEL) detection model based on Singular Value Decomposition (SVD) and the Rotation Forest ensemble strategy. The model comprises three primary modules: the Data Set Pretreatment Module, the Heterogeneous Multi-classifier Detection Module, and the Classification Result Acquisition Module. The HMEL model can eliminate redundancy and correlation among the attribute fields of network traffic. Theoretical analysis indicates that the HMEL model has stronger generalization ability and universality. We compared it experimentally with homogeneous classification detectors built from Random Forest, k-NN, and Bagging individually, each with and without SVD. The results show that the HMEL model outperforms k-NN and is comparable to Random Forest and Bagging in TNR, Accuracy, and Precision. k-NN is unstable in TNR, Accuracy, and Precision as the threshold value varies. The HMEL model therefore has both strong detection ability and stability.

In conclusion, the dissertation presents a series of explorations and in-depth studies based on machine learning and statistical analysis. The three basic principles are removing redundancy, lowering noise, and eliminating correlation among the attribute fields of network traffic. Our goal is real-time, distributed, and accurate detection of DDoS attacks. DDoS attack detection is achieved with a heterogeneous ensemble classification detection model that has strong generalization ability, universality, and stability.
Some notable experimental results were obtained. The dissertation contributes work intended to promote further study of the relevant theoretical methods and their application in various scenarios.
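
The abstract compares detectors on DR, Accuracy, Precision, FPR, and TNR at different threshold values without restating how these indicators are computed. The following is a minimal Python sketch of the standard confusion-matrix definitions, assuming binary labels (1 = attack, 0 = normal) and a detector that outputs a score per traffic record. The function names, the toy scores, and the 0.5 threshold are hypothetical and are not taken from the dissertation, whose exact definitions may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # attack records flagged as attack
    fp: int  # normal records flagged as attack
    tn: int  # normal records passed as normal
    fn: int  # attack records missed


def labels_from_scores(scores, threshold):
    """Flag a record as attack (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def count_outcomes(y_true, y_pred):
    """Tally the confusion matrix; labels are assumed to be 0 or 1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def detection_metrics(c):
    """Standard textbook definitions of the five indicators."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "DR": _ratio(c.tp, c.tp + c.fn),         # detection rate (recall)
        "Accuracy": _ratio(c.tp + c.tn, total),
        "Precision": _ratio(c.tp, c.tp + c.fp),
        "FPR": _ratio(c.fp, c.fp + c.tn),        # false positive rate
        "TNR": _ratio(c.tn, c.tn + c.fp),        # true negative rate
    }


# Toy data: four attack records followed by six normal records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]

counts = count_outcomes(y_true, labels_from_scores(scores, threshold=0.5))
for name, value in detection_metrics(counts).items():
    print(f"{name}: {value:.3f}")
```

Re-running the last three lines with other threshold values gives a rough picture of how stable each indicator is as the threshold moves, which is the kind of comparison the abstract reports. Under these definitions TNR and FPR always sum to 1 whenever normal records are present.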

Keywords/Search Tags:

multivariate dimensionality reduction analysis, random forest, heterogeneous multi-classifier ensemble learning, real time, distributed

Related items

1	Semi-Supervised Dimensionality Reduction And Ensemble Learning For Multi-label Classification
2	Research On The Dimensionality Reduction And Classification Algorithms In Multi-label Learning
3	Research Of Method And Application On Dimensionality Reduction Of High Dimensional Data Based On Multivariate Chart
4	The Study Of Graph-based Semi-supervised Learning/Dimensionality Reduction Methods And Their Applications
5	Symbiotic Forest: A Lightweight Decision Tree Ensemble Method
6	Feature Dimensionality Reduction And One-class Classifier With Applications To Radar Target Recognition
7	Dimensionality Reduction and Learning on Networks
8	Fault Diagnosis Of Analog Circuits Based On MODWPT And Random Forest
9	Research On Multi-information Fusion Of Distributed Sensors In Two-phase Flow Based On Complex Network Theory
10	Research On Generalized Canonical Correlation Analysis Of Data Dimensionality Reduction