Font Size: a A A

Research On Flow Monitoring Technologies In Internet Backbone Networks

Posted on:2011-08-28Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y ZhangFull Text:PDF
GTID:1118330338489411Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
Recently, the Internet is being attacked frequently which results in the large-scalenetwork paralysis. With the development of our national economy and the rise in thenetwork information level, the network security incidents such as large-scale network at-tacks and network economic crimes emerge one after another. Therefore, we need tomonitor the large-scale network security incidents. In the backbone network monitor-ing, packet-level measurement faces with the massive data to be processed, the storagecapability, processing capacity and transmission capability of the monitoring system aregreatly challenged. The ?ow-level measurement offers a new way for the backbone net-work monitoring and provides a reasonable tradeoff between the volume of informationand its level of detail. Currently, the ?ow monitoring mainly includes identifying ?ows ofparticular attributes (i.e., large ?ows and high ?ows) and ?ow-based network monitoringsystem and port scan detection.The main contents of the dissertation are as follows:(1) Large ?ow identification: by defining the large ?ows with relative size, i.e., fora given threshold ? (0 < ? < 1), all the ?ows whose share exceeds a fraction ? of thetraffic actually observed are considered to be large ?ows, the problem of large ?ow iden-tification in high-speed network monitoring is equivalent to the problem of frequent itemidentification in a weighted data stream. The per-packet processing on current backbonelinks needs to be accomplished within time scales of a few nanoseconds, furthermore, thelarge ?ow identification algorithms face increasingly harder real-time constraints as thespeed of the backbone links increases. However, so far as we know, there are no identify-ing algorithms which can provide constant update time (O(1)) in a weighted data stream.We present an algorithm named Weighted Lossy Counting (WLC) which is able to iden-tify large ?ows in a high-speed weighted data stream with constant update time. WLCemploys a novel efficient partially ordered data structure which is able to provide a fastper-item update speed while keeping the memory cost relatively low. We compare WLCwith state-of-the-art algorithms for finding large ?ows in real traffic traces. The experi-mental results show that WLC performs well in accuracy (recall, precision and averagerelative error) as other algorithms; moreover it has a much higher update speed at the cost of relatively larger memory space used. A theoretical worst-case memory bound of WLCis also derived in this dissertation; however, experiments with long real traffic traces showthat WLC requires much less space than the theoretical bound in practice.(2) Parallel and distributed large ?ow identification: the advent of multi-core pro-cessors calls for efficient parallel designs which can effectively utilize the parallelism ofthe multi-cores. In this dissertation, we address the problem of parallelizing weightedfrequency counting in the context of multi-core processors. We discuss the challenges indesigning an efficient parallel system. Our evaluation and analysis reveal that the naivefine-grained lock design results in excessive overhead and wait, which in turn leads tosevere performance degradation in multi-core architectures. Based on our analysis, wepropose a novel method: precision integrated method (PRIM). PRIM makes use of thetemporal imprecision concept to significantly reduce the merge overhead at the cost ofrelatively large memory space used. Both the theoretical analysis and real traffic experi-ments demonstrate that PRIM delivers almost linear speedup. At last, we demonstrate theapplication of PRIM in distributed large ?ow identification.(3) High-rate ?ow identification: the current high-rate ?ow identification algorithmscan not specify the identification accuracy and have to maintain every sampled ?ow in the?ow table which results in excessive memory cost. Therefore a novel high-rate ?ow iden-tification method based on truncated sequential probability ratio test (TSPRT) is proposed.Through sequential sampling, TSPRT is able to remove the low-rate ?ows and identifythe high-rate ?ows at the early stage which can reduce the memory cost and identifica-tion time respectively. According to the way to determine the parameters in TSPRT, twoversions of TSPRT are proposed: TSPRT-M which is suitable when low memory costis preferred and TSPRT-T which is suitable when short identification time is preferred.The experimental results show that TSPRT requires less memory and identification timein identifying high-rate ?ows while satisfying the accuracy requirement as compared topreviously proposed methods.(4) Flow-based network monitoring system and port scan detection: there have beenmany ?ow monitoring systems, some of them only support mass storage, some of themonly support real-time anomaly detection, there are no system which can support bothmass storage and real-time anomaly detection. We design and implement a ?ow monitor-ing prototype system which not only supports mass storage, but also supports real-time anomaly detection. Currently, there have been many ?ow-based anomaly detection algo-rithms, including DDoS detection algorithms, worm detection algorithms, etc. Amongthem, the ?ow-based port scan detection algorithms suffer from high false positive rateand high false negative rate, therefore we propose a new ?ow-based port scan detectionalgorithm: TFDS (time based ?ow size distribution sequential hypothesis testing). Theintuition behind TFDS is that the FSD (?ow size distribution) entropy of a scanner tendsto be very small comparing to benign hosts. As a result, TFDS combines the FSD en-tropy metric and the main idea of sequential hypothesis testing to identify scanners. Theexperimental results show that TFDS improves the detection performance as compared topreviously proposed methods.
Keywords/Search Tags:network traffic monitoring, large flow, weighted data stream, parallel fre-quent item mining, high-rate flow, flow-based port scan detection
PDF Full Text Request
Related items