Studies Of Network Traffic Identification Based On Machine Learning

Posted on: 2016-12-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yang
GTID: 2308330464464985
Subject: Computer Science and Technology
Abstract/Summary:
As networks grow in scale, network traffic and network threats increase together. Network traffic identification addresses part of this problem: by distinguishing normal traffic from threats, it helps prevent attacks. However, an identification system must analyze massive amounts of data and still monitor in real time without degrading network performance, which places strict demands on running time and memory. How to identify malicious behavior in massive network traffic quickly and accurately is the main issue of this thesis.

Faced with an increasingly complex network environment, this thesis focuses on network traffic identification based on machine learning (ML). It studies feature selection over the extracted network traffic features, together with parameter optimization, to make the recognition algorithm faster and more accurate. The main contributions of this thesis are:

Feature selection based on consensus decision-making: In network traffic identification, too many features describing the traffic easily lead to inefficient classification and poor generalization ability. Building on existing feature selection algorithms, this thesis proposes a consensus method that combines them. The method not only provides a relatively good feature set, but also studies how the number of features affects the classification results, so that the selection can be adapted to practical conditions. Simulation results show that the feature selection improves the detection efficiency of the system and reduces computational complexity.

Parameter optimization based on an improved artificial bee colony, applied to support vector machine (SVM) optimization: The support vector machine is an effective method for predicting and classifying sample data and is widely used in network behavior classification. Its parameters (mainly the penalty factor and the kernel parameters) greatly influence classification accuracy. Artificial bee colony (ABC) is a global random search method that has been applied to parameter optimization in recent years; it is simple to compute and has few parameters to set. This thesis proposes an improved ABC algorithm that aims to overcome the shortcomings of standard ABC, namely its tendency to fall into local optima, its low precision, and its slow convergence, and uses it to optimize SVM parameters. When the employed bees and onlookers update the food sources, the improved algorithm uses a local search strategy based on the current best solution to strengthen the bees' local search capability, accelerate convergence, and obtain higher precision. Introducing a chaotic sequence distributes the food sources more evenly and helps avoid local optima. Simulation results show that the improved ABC algorithm outperforms similar algorithms in search speed and precision.

Finally, this thesis combines these improvements in recognition experiments on the intrusion detection dataset KDDCUP99 in Matlab. The experimental results show that the method achieves higher recognition efficiency while maintaining accuracy.
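
The abstract does not give the update formulas of the improved ABC algorithm, so the following Python sketch is only an illustration of the two ideas it names: a search step guided by the current best solution, and a chaotic sequence for spreading food sources. Everything specific here is an assumption rather than the thesis's method: the best-guided term follows the commonly used gbest-guided ABC variant, the chaotic sequence is a logistic map, and the function name improved_abc, its parameters, and the sphere test objective are all hypothetical.

```python
import random


def logistic_map(x):
    """One step of the logistic map with r = 4 (an assumed chaotic sequence)."""
    return 4.0 * x * (1.0 - x)


def sphere(point):
    """Stand-in objective; the thesis would use an SVM error measure instead."""
    return sum(v * v for v in point)


def improved_abc(objective, bounds, n_sources=20, max_iter=200, limit=30, seed=0):
    """Minimize `objective` inside box `bounds`; returns (best_point, best_cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    chaos = 0.7  # assumed start value, away from the map's fixed points

    def chaotic_point():
        # Place a food source using successive logistic-map values.
        nonlocal chaos
        point = []
        for low, high in bounds:
            chaos = logistic_map(chaos)
            if not 0.0 < chaos < 1.0:  # guard against floating-point collapse
                chaos = rng.uniform(0.1, 0.9)
            point.append(low + chaos * (high - low))
        return point

    sources = [chaotic_point() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_idx = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[best_idx][:], costs[best_idx]

    def try_neighbor(i):
        # Perturb one coordinate, pulled toward the current best solution.
        nonlocal best, best_cost
        k = rng.choice([idx for idx in range(n_sources) if idx != i])
        j = rng.randrange(dim)
        low, high = bounds[j]
        step = rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        pull = rng.uniform(0.0, 1.5) * (best[j] - sources[i][j])
        candidate = sources[i][:]
        candidate[j] = min(high, max(low, sources[i][j] + step + pull))
        cost = objective(candidate)
        if cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = candidate, cost, 0
            if cost < best_cost:
                best, best_cost = candidate[:], cost
        else:
            trials[i] += 1

    for _ in range(max_iter):
        for i in range(n_sources):  # employed bees
            try_neighbor(i)
        # Onlookers pick sources with probability proportional to fitness.
        weights = [1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c) for c in costs]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbor(i)
        for i in range(n_sources):  # scouts replace exhausted sources
            if trials[i] > limit:
                sources[i] = chaotic_point()
                costs[i] = objective(sources[i])
                trials[i] = 0
                if costs[i] < best_cost:
                    best, best_cost = sources[i][:], costs[i]
    return best, best_cost


if __name__ == "__main__":
    point, cost = improved_abc(sphere, [(-5.0, 5.0)] * 2, seed=1)
    print(point, cost)
```

For SVM tuning, the objective would presumably map a (penalty factor, kernel parameter) pair to a classification error, with bounds set to the search range of each parameter. The sphere function above only stands in for that error surface.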
Keywords/Search Tags: machine learning, network traffic identification, consensus decision-making, feature selection, artificial bee colony