
DDoS Attack Detection Based On Machine Learning And Real-time Computation And Analysis Of Big Data

Posted on: 2020-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: J T Pei
GTID: 2428330623456658
Subject: Software engineering
Abstract/Summary:
Distributed denial-of-service (DDoS) attacks are among the most common types of network attack. With the rapid development of computer and communication technology, the harm they cause is becoming increasingly serious, which makes research on DDoS attack detection more important. Related work has made some progress, but because DDoS attacks vary widely in type and in the volume of attack traffic, a detection method with high accuracy has not yet appeared. After surveying relevant research in China and abroad, this thesis proposes a DDoS attack detection method based on machine learning and real-time big data computation and analysis.

The random forest algorithm is an excellent machine learning algorithm for classification and regression. By analyzing classification models based on random forest, this thesis identifies three defects and proposes one improvement for each.

First, the classification performance of a random forest model is closely tied to the number of features randomly selected when constructing each decision tree. Most random forest classification models do not compute this number in any principled way and simply adopt conventional values. To address this defect, this thesis proposes a forward-search K-fold cross-validation method to calculate the number of randomly selected features, so that tree similarity and classification performance reach the best balance while the random forest classification model is constructed.

Second, when the traditional random forest algorithm is applied to classification problems, the commonly used voting combination strategies perform poorly if the decision trees are strongly correlated, so the accuracy of the overall classification model is low. To solve this problem, this thesis proposes a CA-LR learning method, which optimizes the traditional random forest algorithm by addressing the low accuracy of the overall classification model when the correlation between decision trees is strong.

Third, the traditional evaluation metrics for ensemble learning models mostly focus on the classification performance of the overall model and ignore that of the individual models. This thesis therefore proposes a new ROC-AUC-PR evaluation index, which first evaluates each individual model within the ensemble, then updates and optimizes the preliminary classification model, and finally evaluates the overall classification performance of the model.

A DDoS attack works by sending large volumes of data packets to the target. To handle packets at this scale, traditional DDoS attack detection relies mainly on the MapReduce (MR) distributed computing and analysis framework. Because that framework is based on offline computation, detection built on it suffers from large delays. To detect DDoS attacks faster and in a more timely manner, this thesis proposes a real-time computing and analysis framework based on Spark Streaming and Kafka. Experiments show that the improved random forest algorithm, combined with real-time big data computation and analysis, achieves better timeliness and higher accuracy in detecting the currently prevalent TCP flood, UDP flood and ICMP flood attacks.
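
The abstract does not spell out how the forward-search K-fold cross-validation is carried out, so the following is only a minimal sketch of one plausible reading: try the number of randomly selected features m in increasing order, score each value by K-fold cross-validation, and stop once the score has failed to improve a few times. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the `patience` stopping rule, the use of one-split decision stumps in place of full decision trees, and the synthetic data standing in for traffic features.

```python
import random
from statistics import mean

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for shuffled K-fold CV."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def stump_predict(stump, row):
    feature, threshold, label_above = stump
    return label_above if row[feature] > threshold else 1 - label_above

def train_stump(X, y, features):
    """Pick the threshold split over `features` with the fewest errors (labels 0/1)."""
    best = None  # (errors, stump)
    for f in features:
        for t in sorted({row[f] for row in X}):
            for label_above in (0, 1):
                stump = (f, t, label_above)
                errors = sum(stump_predict(stump, r) != lab for r, lab in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, stump)
    return best[1]

def train_forest(X, y, m, n_trees, rng):
    """Each stump is fit on a bootstrap sample using m randomly chosen features."""
    forest = []
    for _ in range(n_trees):
        boot = [rng.randrange(len(X)) for _ in X]
        features = rng.sample(range(len(X[0])), m)
        forest.append(train_stump([X[i] for i in boot], [y[i] for i in boot], features))
    return forest

def forest_predict(forest, row):
    votes = sum(stump_predict(s, row) for s in forest)
    return int(2 * votes > len(forest))  # simple majority vote

def cv_accuracy(X, y, m, k=3, n_trees=9, seed=0):
    """Mean held-out accuracy over k folds for a given m."""
    rng = random.Random(seed)
    scores = []
    for train, test in k_fold_splits(len(X), k, seed):
        forest = train_forest([X[i] for i in train], [y[i] for i in train], m, n_trees, rng)
        hits = sum(forest_predict(forest, X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return mean(scores)

def forward_search(candidates, score_fn, patience=2):
    """Scan candidates in order; stop after `patience` non-improving values."""
    best_m, best_score, stale = None, float("-inf"), 0
    for m in candidates:
        score = score_fn(m)
        if score > best_score:
            best_m, best_score, stale = m, score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_m, best_score

# Synthetic stand-in data (assumption): 4 features, only feature 0 is informative.
data_rng = random.Random(0)
X = [[data_rng.random() for _ in range(4)] for _ in range(60)]
y = [int(row[0] > 0.5) for row in X]

best_m, best_score = forward_search(range(1, 5), lambda m: cv_accuracy(X, y, m))
print("selected m:", best_m, "cross-validated accuracy:", round(best_score, 3))
```

With a real library the same loop would wrap that library's forest and cross-validation routines; the stumps are used here only to keep the sketch self-contained.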
Keywords/Search Tags:DDoS attack detection, Machine learning, Random forest, Big Data, Real-time calculation analysis