Research On Encrypted Traffic Identification Algorithm Based On Machine Learning

Posted on: 2021-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Wu
Full Text: PDF
GTID: 2428330614450025
Subject: Cyberspace security
Abstract/Summary:
With the continuous development of information technology, networks now carry a large and growing volume of encrypted traffic. Identifying which application produced a given encrypted flow helps improve network management and network services and helps keep the network environment secure, so feature extraction and application identification for encrypted traffic have become increasingly important. This thesis studies encrypted traffic identification algorithms based on machine learning. The main work is as follows:

1) The thesis first analyzes and summarizes traditional traffic identification methods, comparing their advantages, disadvantages, and applicable scenarios. It then analyzes the difficulties these methods face in the current network environment, where the volume of encrypted traffic has exploded, and the advantages of machine learning methods over traditional traffic identification methods.

2) The thesis proposes an application-oriented encrypted traffic identification algorithm based on Bagging. The algorithm uses statistical features of data flows to classify the application type corresponding to encrypted traffic, and uses Isolation Forest to remove noise samples from the feature data set to further improve accuracy. Building on this algorithm, the thesis refines the classification by attempting to identify the functional modules within an application, and proposes a function-oriented encrypted traffic identification algorithm. Because the statistical features of a data flow can hardly cover all functions, function identification introduces load features as an auxiliary input, which effectively improves identification accuracy. Finally, experiments tested the two algorithms on application identification and function identification, and both achieved high accuracy, precision, and recall.

3) Building on the two algorithms above, the thesis proposes a parallel optimization method based on Spark to improve the efficiency of the Bagging algorithm and its adaptability to large-scale data sets. Parallel optimization strategies are proposed at two levels: data optimization and task optimization. Experiments show that this Spark-based method effectively improves the identification efficiency of the Bagging algorithm on large-scale data sets and also makes its accuracy more stable.
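
As a rough illustration of the Bagging idea the abstract relies on, the following minimal Python sketch trains several base learners on bootstrap resamples of labeled flows and combines them by majority vote. It is not the thesis's implementation: the two flow features (mean packet size in bytes, mean inter-arrival time in milliseconds), the toy samples, the labels, the nearest-centroid base learner, and all function names are assumptions made for illustration. The Isolation Forest noise-removal step and the Spark parallelization are omitted.

```python
import random
from collections import Counter
from statistics import mean

# Hypothetical toy data: (mean packet size in bytes, mean inter-arrival ms), label.
flows = [
    ((1200.0, 5.0), "video"), ((1350.0, 4.0), "video"),
    ((1100.0, 6.0), "video"), ((1280.0, 5.5), "video"),
    ((200.0, 120.0), "chat"), ((150.0, 200.0), "chat"),
    ((260.0, 90.0), "chat"), ((180.0, 150.0), "chat"),
]

def fit_centroids(samples):
    """Base learner (an assumption): one mean feature vector per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_label.items()}

def predict_centroid(centroids, features):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def dist(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=dist)

def fit_bagging(samples, n_estimators=15, seed=0):
    """Train one base learner per bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(fit_centroids(bootstrap))
    return models

def predict_bagging(models, features):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]

models = fit_bagging(flows)
print(predict_bagging(models, (1250.0, 5.0)))
print(predict_bagging(models, (190.0, 140.0)))
```

Because each base learner sees a different resample, the vote tends to be less sensitive to individual noisy samples than any single learner. In a real pipeline the features would normally be scaled first, since packet sizes and inter-arrival times sit on very different ranges.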
Keywords/Search Tags: encrypted traffic recognition, feature extraction, machine learning, parallel optimization