A systematic approach to feature selection for encrypted network traffic classification

Posted on: 2014-10-30
Degree: M.A.Sc
Type: Thesis
University: Royal Military College of Canada (Canada)
Candidate: Semeniuk, Trevor John
Full Text: PDF
GTID: 2458390008460238
Subject: Electrical engineering
Abstract/Summary:
Most organizations, including the Canadian Department of National Defence, allow encrypted traffic on their networks so employees can perform transactions such as personal banking. It has been shown that, when legitimate encrypted traffic is allowed, non-authorized or malicious traffic in disguise may also bypass security perimeters. Recent research has focused on developing faster and more accurate methods of detecting non-authorized use by classifying this encrypted traffic, and many successes have been demonstrated. Feature-based statistical classification has produced positive results when applied to encrypted traffic, and various methods have been used to select the feature sets. However, a literature survey did not find evidence of a systematic approach to selecting feature sets and assessing their predictive value for use in encrypted traffic classification.

The objective of this research was to develop a general-purpose method of selecting feature subsets that result in high prediction accuracy when used for encrypted traffic classification. The methodology developed uses the fast orthogonal search (FOS) algorithm to select feature subsets with discriminative power. Success was defined in terms of the prediction accuracy of the subset of features selected by the FOS algorithm, as compared to subjectively selected features and features selected by the Best First algorithm. In all experiments the FOS algorithm achieved comparable or better classification results with substantially reduced feature subsets. In the final experiment the FOS algorithm selected a 12-feature subset from a set of 2,839 features. This subset achieved a receiver operating characteristic (ROC) area under the curve (AUC) of 0.9898, compared to a benchmark AUC of 0.9893 achieved using a 44-feature primary set. This translates to 106 fewer errors using a subset with 32 fewer features, and an 81% reduction in computation time for classification.
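The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind orthogonal forward selection, not the thesis's FOS implementation. It assumes each feature is a numeric column and the class label is encoded as a number; at each step it picks the feature whose component orthogonal to the already chosen features explains the most remaining variance in the target. The function name `select_features`, the `min_gain` stopping threshold, and the toy data are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def center(v):
    """Subtract the mean so a constant term is implicitly accounted for."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def select_features(columns, target, max_features, min_gain=1e-9):
    """Greedy orthogonal forward selection (simplified, FOS-like sketch).

    columns      : list of feature columns, each a list of floats
    target       : list of numeric labels, same length as each column
    max_features : upper bound on the number of features to select
    min_gain     : stop when no candidate reduces squared error by more
                   than this amount (hypothetical stopping rule)
    Returns the indices of the selected columns, in selection order.
    """
    y = center(target)
    candidates = {j: center(col) for j, col in enumerate(columns)}
    basis = []   # orthogonalized versions of the chosen columns
    chosen = []
    while candidates and len(chosen) < max_features:
        best_j, best_gain, best_w = None, min_gain, None
        for j, col in candidates.items():
            # Gram-Schmidt: remove what the chosen features already explain.
            w = col
            for b in basis:
                coef = dot(w, b) / dot(b, b)
                w = [wi - coef * bi for wi, bi in zip(w, b)]
            energy = dot(w, w)
            if energy < 1e-12:
                continue  # candidate is redundant with chosen features
            # Reduction in squared error if this candidate were added.
            gain = dot(w, y) ** 2 / energy
            if gain > best_gain:
                best_j, best_gain, best_w = j, gain, w
        if best_j is None:
            break
        chosen.append(best_j)
        basis.append(best_w)
        del candidates[best_j]
    return chosen


# Toy data: column 0 is unrelated to the target, column 2 duplicates
# column 1 (scaled by 2), and the target is 3 * column 1 + column 3.
columns = [
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]
target = [4.0, 5.0, 10.0, 11.0, 16.0, 17.0]
selected = select_features(columns, target, max_features=4)
print(selected)
```

In this toy case the procedure stops after two features even though up to four are allowed, because the redundant duplicate and the unrelated column add no further reduction in error. That is the same qualitative behaviour the abstract reports at a larger scale: a small subset that performs comparably to a much larger feature set.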
Keywords/Search Tags:Traffic, Encrypted, Feature, Classification, FOS algorithm, Select, Subset