Internet Traffic Classification And Identification By Using Support Vector Machines

Posted on: 2013-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: J N Liu
GTID: 2248330371983035
Subject: Network and information security
Abstract/Summary:
Classifying and identifying network traffic is important for improving network quality of service, optimizing network performance, and avoiding virus attacks. As networks have developed, classifying traffic accurately has become one of the most important issues. Port-based classification no longer guarantees reliable accuracy, and payload-based classification needs to analyze the full flow payload, so its overhead is unacceptable. To address this long-standing problem, we propose a traffic classification and identification method based on the One-Class Support Vector Machine (OCSVM), which uses only positively labeled training data.

Support Vector Machines, which are based on the Structural Risk Minimization principle and the Vapnik-Chervonenkis dimension, seek the separating hyperplane that maximizes the margin between two classes. The classification problem is converted into a quadratic programming problem through Lagrange multipliers and the dual problem, and a kernel function maps the flows into a high-dimensional space so that the training set becomes linearly separable. SVMs are known to suit classification problems with small samples and high-dimensional feature spaces. Because a One-Class SVM is well suited to unbalanced flow sets, we build the traffic classification model on OCSVM. This paper uses Python and Weka to process the data set, and MATLAB and LIBSVM as development tools to implement the traffic classification model.

For SVM, classification performance depends heavily on the parameter values, so optimizing parameter selection is essential. The F1-measure is used as the criterion. The model is then trained with the best parameters found.

The innovations of this paper are as follows:

1. There are two experimental models: one uses all kinds of flows, and the other uses only the DB, MAIL, SERV, and WWW types. Both achieve high precision and a low false alarm rate.

2. Because the classification model is based on the One-Class SVM, the training set contains only positive data, so FP and TN are 0 and the assessment is limited. This paper proposes an improved 9-fold cross-validation algorithm whose data set contains samples not only of the target type but also of all the other types, which makes the assessment more reliable.

3. Each flow in the data set has 248 attributes, so both data collection and classification are time-consuming, and redundancy and noise affect classification performance to some extent. In the second model, we use FCBF feature selection to select 20 of the 248 flow features, which not only clearly reduces the dimensionality of the flows and lowers the time complexity, but also improves classification accuracy and stability.

4. Parameter optimization for OCSVM has received little research attention in the IT field. This paper proposes three algorithms for finding the optimal parameters of the classification model: a high-resolution v-grid search, particle swarm optimization with a dynamic inertia factor w, and weighted simulated annealing. They are compared with the default grid search algorithm, the genetic algorithm, and the ant colony optimization algorithm; all of the algorithms achieve a high F1-measure.

According to the performance assessment, the proposed traffic classification model is more accurate and stable. It has great advantages on unbalanced data sets and in high-dimensional attribute spaces.
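
The evaluation idea in innovation 2 can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (f1_measure, one_class_cv, fit_interval, predict_interval), the slack parameter, and the toy one-dimensional data are all hypothetical, and the interval model is only a stand-in for a real One-Class SVM. Each fold trains on positive samples only, then tests on the held-out positives plus samples of the other types, so FP and TN are no longer forced to 0.

```python
import random
from statistics import mean


def f1_measure(tp, fp, fn):
    """F1 = 2PR / (P + R); returns 0.0 when precision or recall is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def one_class_cv(positives, others, fit, predict, k=9, seed=0):
    """k-fold CV for a one-class model (assumed reading of the abstract).

    Training uses target-class samples only; every test fold mixes
    held-out target samples with samples of the other classes.
    """
    rng = random.Random(seed)
    pos, neg = list(positives), list(others)
    rng.shuffle(pos)
    rng.shuffle(neg)
    results = []
    for i in range(k):
        train = [x for j, x in enumerate(pos) if j % k != i]
        test_pos = pos[i::k]   # held-out samples of the target type
        test_neg = neg[i::k]   # samples of all the other types
        model = fit(train)
        tp = sum(1 for x in test_pos if predict(model, x))
        fp = sum(1 for x in test_neg if predict(model, x))
        fn = len(test_pos) - tp
        tn = len(test_neg) - fp
        results.append({"tp": tp, "fp": fp, "fn": fn, "tn": tn,
                        "f1": f1_measure(tp, fp, fn)})
    return results


# Hypothetical stand-in for a one-class model (NOT an OCSVM):
# accept a sample if it lies within a radius of the training mean.
def fit_interval(train, slack=1.0):
    center = mean(train)
    radius = max(abs(x - center) for x in train) + slack
    return center, radius


def predict_interval(model, x):
    center, radius = model
    return abs(x - center) <= radius


# Toy one-dimensional "flows": the target class near 0, other classes near 5+.
target_flows = [(i % 10) / 10 for i in range(90)]
other_flows = [5.0 + i / 10 for i in range(45)]

folds = one_class_cv(target_flows, other_flows, fit_interval, predict_interval)
print("mean F1:", mean(r["f1"] for r in folds))
```

Passing an empty list for others reproduces the situation the abstract describes, in which every fold reports FP = 0 and TN = 0. In practice the fit and predict callables would wrap a real one-class SVM, for example one built with LIBSVM, which the abstract names as a development tool.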
Keywords/Search Tags: Traffic classification, Machine learning, Support vector machine, One Class Support vector machine, Parameter selection