
Research On Optimization And Ensemble Of Network Traffic Classifier

Posted on: 2020-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Diao
Full Text: PDF
GTID: 2428330602961451
Subject: Software engineering
Abstract/Summary:
Network traffic classification is both a key technology for guaranteeing network service quality and an important method for traffic cleaning and for detecting network attacks or malicious code. At present, typical traffic classification methods include rule-based classification using the port numbers of the communicating parties, content-based classification using Deep Packet Inspection (DPI), and statistical classification using machine learning. In practice, the use of dynamic ports reduces the accuracy of port-based classification, and the widespread adoption of information security technologies makes DPI-based methods impractical. As a result, machine learning algorithms have attracted growing attention from researchers in the past few years, and a variety of them are now widely used for network traffic classification.

The following problems remain in research on machine learning for network traffic classification: (1) Feature selection during decision tree construction usually considers only which single-feature partition is locally optimal for the current sample subset, so branching decisions are not made from a global perspective. (2) Network datasets are usually extremely imbalanced, but current research rarely examines model accuracy under such uneven sample distributions. The resulting models are biased, and their recognition rate for some samples in the dataset is poor. (3) Traffic data contains many samples at the edge of the distribution. Existing traffic classification studies do not address these abnormal samples, which increase the risk of overfitting.

In response to these problems, this thesis focuses on the following aspects: (1) The feature extraction process is optimized. The correlation between features in the dataset is used to add global information to the local feature extraction process, so that the model can better capture the global characteristics of the dataset. (2) To address imbalanced datasets, the ensemble structure of the algorithm is adjusted, and a new ensemble structure based on sample prioritization is proposed. It exploits both the high accuracy of binary classifiers and the adaptability of the full (multi-class) classifier across traffic classes. (3) A decision tree pruning method that extracts abnormal samples is proposed. A clustering approach is used to judge the generalization performance of leaf nodes, and abnormal samples are classified separately to further improve the accuracy of the model.
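The imbalance problem described in point (2) can be illustrated with a minimal sketch. It is not the thesis's method or data: the labels "web" and "attack", the 95/5 split, and the helper names `accuracy` and `per_class_recall` are all assumptions chosen for illustration. The sketch shows how overall accuracy can look high while the recognition rate of a minority class is poor.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / n for label, n in total.items()}


# Hypothetical toy data: 95 majority-class flows and 5 minority-class flows.
y_true = ["web"] * 95 + ["attack"] * 5

# A degenerate classifier that always predicts the majority class.
y_pred = ["web"] * 100

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)

print(f"overall accuracy: {overall:.2f}")
for label, recall in recalls.items():
    print(f"recall[{label}]: {recall:.2f}")
```

Under these assumptions the overall accuracy is 0.95 while the recall of the minority class is 0, which is why per-class metrics are a more informative check than a single accuracy figure on imbalanced traffic datasets.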
Keywords/Search Tags:network traffic classification, decision tree, feature extraction, pruning, ensemble model