Research On Techniques Of Collecting Internet Traffic Ground Truth

Posted on: 2012-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: S J Huang
Full Text: PDF
GTID: 2218330362959381
Subject: Computer application technology
Abstract/Summary:
Collecting Internet traffic ground truth accurately and efficiently is an important and challenging problem in traffic analysis. With reliable ground truth, governments, Internet Service Providers (ISPs), network administrators and network data analysts can manage traffic and maintain network security at low cost and with high efficiency. Classic methods based on port numbers and deep packet inspection (DPI) have lost much of their classification coverage as P2P applications have grown, because they cope poorly with encrypted traffic. Methods based on deep flow inspection (DFI) or flow topology information, however, require pure ground-truth traffic as input.

Apart from manual classification, DPI is generally regarded as the most general and reliable technique. This thesis therefore studies how to improve the classification coverage of a DPI system. A DFI module is added to the DPI system, and supervised machine learning inside that module classifies, in a second pass, the traffic that DPI leaves unknown. Results show that this hybrid classifier model overcomes the coverage defects of the DPI system.

Combining DFI with DPI, however, reduces classification granularity. To address this, the thesis proposes a heuristic classification algorithm that assigns an unknown flow to one of the already resolved protocols just before the actual DPI process starts. The concrete steps for implementing this lightweight heuristic algorithm with Wireshark are then described. In this way, classification granularity and coverage rate can be balanced.

These results contribute new ideas to research on collecting traffic ground truth and have both theoretical significance and practical value.
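The two-pass hybrid model described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The signature table, the two flow features (mean packet size, mean inter-arrival time) and the nearest-centroid learner are all assumptions chosen for brevity; the abstract does not say which supervised learner or features were used. The heuristic pre-DPI step is not modelled here.

```python
import math
from collections import defaultdict

# Hypothetical payload signatures, for illustration only
# (not the rule set used in the thesis).
SIGNATURES = {
    "http": (b"GET ", b"POST ", b"HTTP/1."),
    "bittorrent": (b"\x13BitTorrent protocol",),
}


def dpi_classify(payload):
    """Pass 1 (DPI): return a label if the payload starts with a known signature."""
    for label, prefixes in SIGNATURES.items():
        if payload.startswith(prefixes):
            return label
    return None


def train_centroids(samples):
    """Supervised step: compute one mean feature vector per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in vec]
        for label, vec in sums.items()
    }


def dfi_classify(features, centroids):
    """Pass 2 (DFI): pick the label whose centroid is nearest to the flow features."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


def classify_flows(flows):
    """Classify (payload, features) pairs; returns a list of (label, stage) tuples."""
    results = [None] * len(flows)
    labeled, unknown = [], []
    for idx, (payload, features) in enumerate(flows):
        label = dpi_classify(payload)
        if label is None:
            unknown.append(idx)
        else:
            labeled.append((features, label))
            results[idx] = (label, "dpi")
    # Flows resolved by DPI serve as training data for the DFI pass.
    centroids = train_centroids(labeled)
    for idx in unknown:
        if centroids:
            results[idx] = (dfi_classify(flows[idx][1], centroids), "dfi")
        else:
            results[idx] = ("unknown", "none")
    return results


if __name__ == "__main__":
    # Features are (mean packet size in bytes, mean inter-arrival time in ms).
    flows = [
        (b"GET /index.html HTTP/1.1\r\n", (500.0, 20.0)),
        (b"POST /form HTTP/1.1\r\n", (520.0, 25.0)),
        (b"\x13BitTorrent protocol" + b"\x00" * 8, (1400.0, 2.0)),
        (b"\x8f\x02\xaa", (1380.0, 3.0)),  # no signature match, e.g. encrypted
    ]
    for result in classify_flows(flows):
        print(result)
```

In this sketch the DFI pass can only assign labels that DPI has already resolved elsewhere in the data set, and the features are left unscaled. A real system would normalise features and would likely use a stronger learner.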
Keywords/Search Tags: traffic classification, DPI, DFI, machine learning, heuristic classification algorithm