Feature Selection Study in Internet Traffic Classification

Posted on: 2014-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: K Liu
Full Text: PDF
GTID: 2298330431478001
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of network technology, the Internet continues to expand and new network applications keep emerging. On the one hand, the rapid growth of network traffic increases congestion and degrades the quality of network service. On the other hand, the wide variety of network applications not only consumes more and more network resources but also poses a serious threat to network security. To understand the state of the network, diagnose faults promptly, and provide a basis for managing and optimizing the network configuration, network managers need effective methods to monitor and control traffic and to analyze, promptly and accurately, the services the network carries. Rapid and accurate identification and classification of network traffic is the precondition for all of this.

In recent years, traffic classification based on machine learning has become a new direction in network measurement. The key to this approach is to define and extract effective features that distinguish different types of traffic, and to select appropriate machine learning methods. These features include packet-level features and flow-level features. Extracting features that carry rich classification information improves classification accuracy, while reducing the feature dimension shortens classifier modeling time and speeds up classification. This thesis studies feature selection in two ways: manual selection and algorithmic feature selection.

The thesis first analyzes and selects features manually on the basis of the Moore feature set. Unlike other approaches, our study divides the features of the Moore feature set into five categories according to the nature of each feature and studies the contribution of each category to Internet traffic classification. We first identify the key feature categories and then refine them step by step, which shows which features are most discriminative for Internet traffic classification. The experiments compare several machine learning methods so that the conclusions do not depend on the choice of any single classifier. Finally, a set of features with good discriminability is identified, and experiments verify that these features can be used for Internet traffic classification.

Building on the manual analysis and selection, an improved feature selection method is proposed. Manual selection relies on experience and a large number of experiments; it cannot cover all the features and is usually inefficient. Feature selection algorithms, by contrast, select features automatically and efficiently, exclude human interference, and cover as many good features as possible. In our study, we combine the traditional genetic algorithm with existing research results and the information gain measure, and propose a feature selection method based on the genetic algorithm and information gain for Internet traffic identification. Experiments comparing it with other feature selection methods show that the proposed method achieves high accuracy while reducing the number of features, and that it can be applied to large-scale network traffic classification.
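
The information gain measure mentioned above can be illustrated with a minimal sketch. It assumes each feature has already been discretized, and it only ranks features by information gain; it does not reproduce the genetic-algorithm search or the way the thesis combines the two, which the abstract does not specify. The function names, the toy flows, and the feature names (`server_port_class`, `pkt_size_bucket`) are hypothetical and are not taken from the Moore feature set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    total = len(labels)
    # Group the class labels by the value the feature takes on each flow.
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    # H(Y | X): entropy inside each group, weighted by the group's share.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(features, labels):
    """Return (name, information gain) pairs, highest gain first."""
    scored = [(name, information_gain(vals, labels))
              for name, vals in features.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy data: four flows, two traffic classes (illustrative only).
    labels = ["web", "web", "p2p", "p2p"]
    features = {
        "server_port_class": [0, 0, 1, 1],  # separates the classes
        "pkt_size_bucket": [0, 1, 0, 1],    # carries no class information
    }
    for name, gain in rank_features(features, labels):
        print(f"{name}: {gain:.3f} bits")
```

A ranking like this is only a filter-style score. One plausible use in a genetic algorithm, stated here as an assumption rather than the thesis's design, is to bias the initial population toward high-gain features.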
Keywords/Search Tags: traffic classification, traffic features, feature selection, information gain, genetic algorithm