Research On Network Traffic Identification Method Based On Machine Learning

Posted on: 2017-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2348330512955426
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of Internet technology, the explosive growth of network traffic and the flexible use of network protocols have led to a growing number of network threats. Network traffic identification is an effective way to address these threats. As the network environment becomes increasingly complex, identification methods that can handle modern traffic, which is high-dimensional and of many types, are more and more important. Traffic identification algorithms based on machine learning have been a research focus among experts and scholars in recent years. The process of network traffic identification mainly consists of feature processing and traffic identification. Current feature processing methods cannot remove the redundant and the irrelevant features of a feature set simultaneously. To solve this problem, a KL-RF algorithm based on the K-L transform and ReliefF feature selection is proposed. The algorithm uses the K-L transform to remove the redundant features of the original feature set, and adaptively adjusts the feature weight threshold of the ReliefF algorithm to remove the irrelevant features. The resulting high-quality feature subset reduces the complexity of traffic identification and the time needed for model training, and improves operating efficiency. In traffic identification, the AdaBoost-SVM algorithm based on machine learning has a weight imbalance problem caused by multiple errors. To solve this problem, an improved AdaBoost-SVM algorithm is proposed. By adjusting the error distribution of the various samples, this algorithm selects a reasonable method for calculating the weights of the base identifiers, which prevents the sample weights from becoming imbalanced during training and improves the accuracy of traffic identification. Finally, the Andrew W. Moore data set is used to verify the KL-RF algorithm and the improved AdaBoost-SVM algorithm. The experimental results show that, compared with the original algorithms, the algorithms proposed in this paper reduce the dimension of the feature subset, shorten the time needed to build the identification model, and improve the accuracy of traffic identification.
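
The two-stage feature processing described in the abstract can be sketched roughly as follows. This is only an illustrative approximation, not the thesis's KL-RF implementation: the K-L transform is realized as an eigen-decomposition of the covariance matrix, the feature weighting is a simplified Relief with one nearest hit and one nearest miss (not full ReliefF), and the "adaptive" threshold is assumed here to be the mean feature weight, since the abstract does not give the actual rule. The function names and the `energy` parameter are hypothetical.

```python
import numpy as np

def kl_transform(X, energy=0.95):
    """K-L transform: project centered data onto the leading covariance
    eigenvectors that retain `energy` of the variance (assumed criterion)."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # sort descending
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard tiny negatives
    eigvecs = eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(ratio, energy)) + 1, X.shape[1])
    return Xc @ eigvecs[:, :k]

def relief_weights(X, y):
    """Simplified Relief: reward features that differ on the nearest miss
    and agree on the nearest hit. Assumes >= 2 samples per class."""
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span               # scale features to [0, 1]
    w = np.zeros(X.shape[1])
    for i in range(len(Xs)):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)
        dist[i] = np.inf                          # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / len(Xs)

def kl_rf_select(X, y, energy=0.95):
    """Decorrelate with the K-L transform, then keep components whose
    Relief weight exceeds the mean weight (assumed adaptive threshold)."""
    Z = kl_transform(X, energy)
    w = relief_weights(Z, y)
    keep = w > w.mean()
    if not keep.any():                            # always keep at least one
        keep[np.argmax(w)] = True
    return Z[:, keep], keep
```

Calling `kl_rf_select(X, y)` on a sample-by-feature matrix `X` with class labels `y` returns the reduced matrix and a boolean mask over the K-L components, which could then be passed to any classifier.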
Keywords/Search Tags: Feature Process, K-L Transform, ReliefF, Traffic Identification, SVM, AdaBoost