
A Method For Estimation Of Flow Length Distributions From Double Sampled Flow Statistics

Posted on: 2010-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
Full Text: PDF
GTID: 2178360275453632
Subject: Computer Science and Technology
Abstract/Summary:
Network traffic monitoring and analysis are crucial for many network applications, such as network management, network planning, and network security. With the rapid growth of link speeds in recent years, it has become very difficult to measure all traffic information when facing a large number of flow sources. How to monitor a large volume of network flows accurately and efficiently has become a research hot spot.
To address these measurement problems, sampling measurement has been proposed: the original flow information is estimated statistically from a sampled subset of the traffic. Sampling techniques are classified into flow-based sampling and packet-based sampling. Flow-based sampling is highly accurate but consumes a large amount of resources; packet-based sampling is scalable, but its estimation accuracy is low.
To overcome these shortcomings of packet sampling and flow sampling, we propose a sampling strategy called double sampling, which combines the two. It applies two sampling rules: flow sampling, where entire flows of packets are retained or discarded at once, and packet sampling, which acts directly on individual packets and is ignorant of flows. Using real network traffic traces, we show that the proposed double sampling technique does reduce the use of system resources and the amount of information collected.
Knowing the number and lengths of the unsampled flows remains useful for characterizing traffic and the resources required to accommodate its demands. This thesis investigates methods that use flow statistics formed from the sampled packet stream to infer the lengths of flows and their frequencies in the unsampled stream. Several methods for estimating the original flow length distribution from sampled flow statistics are discussed in order to recover the distribution of the unsampled flows.
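The two sampling rules described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `double_sample`, the independent Bernoulli selection, and the order (flow sampling first, then packet sampling) are all hypothetical choices made for illustration; the abstract does not specify how the thesis implements either stage.

```python
import random
from collections import Counter


def double_sample(flow_lengths, flow_prob, packet_prob, seed=0):
    """Return the observed lengths of flows that survive both sampling stages.

    Assumptions (not from the thesis): each flow is kept independently with
    probability flow_prob, and each packet of a kept flow is then kept
    independently with probability packet_prob.
    """
    rng = random.Random(seed)
    observed = []
    for length in flow_lengths:
        # Flow sampling: the whole flow is retained or discarded at once.
        if rng.random() >= flow_prob:
            continue
        # Packet sampling: acts on individual packets, ignorant of flows.
        kept = sum(1 for _ in range(length) if rng.random() < packet_prob)
        # A flow with no sampled packets is invisible to the monitor.
        if kept > 0:
            observed.append(kept)
    return observed


# Hypothetical traffic mix: many short flows, a few long ones.
original = [1] * 500 + [5] * 300 + [50] * 20
sampled = double_sample(original, flow_prob=0.5, packet_prob=0.1)
print(len(original), "original flows ->", len(sampled), "observed flows")
print(Counter(sampled).most_common(5))
```

The simulation shows both kinds of loss at once: some flows disappear entirely, and the surviving flows appear shorter than they were.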
Sampling entails an inherent loss of information. We first analyse the information loss in the double sampling method. In the packet sampling process, the loss consists of both a decrease in the total number of flows and a decrease in the number of packets per flow. In the flow sampling process, the loss is only a decrease in the total number of flows. Network flows are divided into long flows and short flows according to the probability that a flow goes unsampled. To invert packet sampling, the scaling method is used to estimate the distribution of long flows, and the expectation maximization (EM) algorithm is used to estimate the distribution of short flows. To invert flow sampling, we again use the EM algorithm to estimate the original flow length distribution from the sampled flow distribution estimated in the previous step. The experimental results demonstrate that the inferred distributions are effective.
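The packet sampling inversion described above can be sketched as follows. This is one common formulation, not necessarily the one used in the thesis: it assumes independent packet sampling with a known probability `p`, so that an original flow of `i` packets yields a binomially distributed number of sampled packets. The function names, the `max_len` cap on short-flow lengths, and the fixed iteration count are hypothetical.

```python
from math import comb


def binom_pmf(i, j, p):
    """Probability that j of i packets are sampled, each with probability p."""
    return comb(i, j) * p**j * (1 - p) ** (i - j)


def scale_long_flows(sampled_counts, p):
    """Scaling estimate: a sampled flow of j packets is attributed to an
    original flow of about j / p packets (long flows are rarely missed)."""
    estimate = {}
    for j, n in sampled_counts.items():
        key = round(j / p)
        estimate[key] = estimate.get(key, 0) + n
    return estimate


def em_short_flows(sampled_counts, p, max_len, iters=200):
    """EM estimate of the original length distribution of short flows.

    sampled_counts maps an observed length j >= 1 to the number of flows seen.
    Assumes original lengths lie in 1..max_len (an illustrative cap).
    """
    if max(sampled_counts) > max_len:
        raise ValueError("observed length exceeds max_len")
    lengths = range(1, max_len + 1)
    # seen[i]: probability that a flow of i packets has at least one sample.
    seen = {i: 1 - (1 - p) ** i for i in lengths}
    # c[i][j]: P(j sampled packets | original length i, flow was seen).
    c = {i: {j: binom_pmf(i, j, p) / seen[i] for j in range(1, i + 1)}
         for i in lengths}
    phi = {i: 1.0 / max_len for i in lengths}  # uniform starting guess
    for _ in range(iters):
        new = dict.fromkeys(lengths, 0.0)
        for j, n in sampled_counts.items():
            denom = sum(phi[i] * c[i].get(j, 0.0) for i in lengths)
            if denom == 0.0:
                continue
            # E-step: share each observed count among candidate lengths.
            for i in lengths:
                new[i] += n * phi[i] * c[i].get(j, 0.0) / denom
        # M-step: renormalise to a probability distribution.
        total = sum(new.values())
        phi = {i: v / total for i, v in new.items()}
    # phi describes flows that were seen; reweight to undo the bias toward
    # lengths that are more likely to be sampled at all.
    weights = {i: phi[i] / seen[i] for i in lengths}
    norm = sum(weights.values())
    return {i: w / norm for i, w in weights.items()}


# Hypothetical observed counts, for illustration only.
print(scale_long_flows({10: 3, 20: 1}, p=0.1))
print(em_short_flows({1: 80, 2: 15, 3: 5}, p=0.5, max_len=6))
```

The flow sampling stage could in principle be inverted with the same kind of EM loop using a different observation model; the abstract does not give enough detail to sketch that stage here.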
Keywords/Search Tags: flow sampling, packet sampling, double sampling, flow length, EM algorithm