
Research On The Key Technology Of Network Traffic Measurement And Identification

Posted on: 2016-07-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Hou
GTID: 1108330482979098
Subject: Communication and Information System
Abstract/Summary:
Traffic measurement and identification underpin network management, operation, prioritization and security, and they are essential for understanding the behavior and characteristics of networks. With the rapid development of network technologies, the number of users has ballooned, data rates have grown rapidly, and networks carry increasingly diverse applications. Traditional methods based on port or payload identification can no longer identify traffic in high-speed networks, because services have diversified and steganography is widely used. Effective traffic measurement and identification technologies are therefore urgently needed to meet the future challenges of network management.
High-speed networks carry a large number of concurrent flows and packets arriving at high rates. Algorithms based on simple flow characteristics can process these flows online, but their identification performance is poor. Researchers have therefore turned to fusing more features and using more complex models, which imposes a heavy computational burden. Moreover, current traffic identification technologies do not consider the differentiated management demands of different applications, so application recognition under constraints is far from realized. In short, how to balance recognition accuracy and efficiency in online traffic identification, and how to achieve priority-based recall of network applications in network management, remain key problems in network technology.
This thesis originates from the major project "Unified Security Control Network for Triple Play" and the theme project "Cross-network Information Security" of the National High-Tech Research and Development Program of China (863 Program). To meet the need for real-time identification and control, the thesis focuses on the two core parts of application identification: traffic measurement and traffic recognition. The research is carried out in four aspects, and the main work and contributions are outlined as follows:
1. To avoid the huge cost of capturing all packets for traffic identification in high-speed networks, we propose an early traffic sampling algorithm called SSCBF (Same Source and Combination Bloom Filter), which works by reducing the number of packets. Because flows that have already been sampled far outnumber flows that are currently being sampled, two Bloom filters are designed, one for the sampling decision and one for packet counting. The two filters share the same hash functions but use counters of different widths. Theoretical analysis shows that the algorithm can minimize false positives by adjusting the width ratio of the two filters. Experiments on complexity and false positives using real traces verify the effectiveness of the algorithm. With the same memory, the false positive rate of SSCBF is lower than that of the other algorithms; with the same false positive rate, SSCBF needs at least 33% less memory.
2. To solve the problems that flows with an abnormal end cause when heavy hitters are identified with a CBF (Counting Bloom Filter), we design a novel scheme based on a self-adapting CTBF (Counting and Timeout Bloom Filter). The CTBF combines a CBF with a TBF (Timeout Bloom Filter). The CBF records the number of packets and decides which flows are heavy hitters from its counters. The TBF stores the arrival time of the latest packet and clears the CBF promptly when a flow ends. This mechanism relieves space congestion in the Bloom filter and identifies heavy hitters even when flows end abnormally. The timeout is adjusted dynamically according to the traffic arrival intensity and the length of the Bloom filter vector. Experiments on real network traces show that the self-adaptive timeout algorithm is more accurate than the fixed-timeout algorithm and more accurate than existing algorithms.
3. To solve the problem that traditional traffic identification algorithms cannot meet differentiated classification accuracy requirements, we propose an algorithm based on priority constraints. The algorithm consists of two parts: construction of a decision tree based on category information entropy, and pruning with weighted PEP (Pessimistic Error Pruning). The final decision tree focuses on high-priority applications and improves their recall during classification. Experimental results show that the recognition results are consistent with the priority constraints and achieve a relative balance between accuracy and efficiency. Compared with the standard C4.5 decision tree algorithm, the proposed algorithm obtains significantly higher recall for high-priority applications and meets the differentiated classification constraints. Although its overall accuracy is slightly lower, its F-measure is almost the same as that of C4.5.
4. To improve the processing rate of online traffic identification, we take a new perspective, traffic reduction, and present a novel online traffic identification method based on flow sets. Because flows that share the same triplet always belong to the same application, the traffic that has to be identified can be drastically reduced: only some of the flows in a triplet set are identified, and the application of the whole set is decided by voting. The relationship between the number of identified flows and the error rate is derived through theoretical analysis. The performance and processing rate of the algorithm are verified by experiments. The results indicate that, with a reasonable estimate of the error rate threshold, the proposed method significantly improves the processing rate without degrading accuracy.
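The CBF plus TBF combination described in contribution 2 can be illustrated with a minimal Python sketch. It is not the thesis's CTBF: the class name, the parameters, the salted SHA-256 hashing and the fixed timeout are all assumptions made for illustration, and the self-adapting timeout is not reproduced.

```python
import hashlib


class CountingTimeoutBloomFilter:
    """Toy counting Bloom filter whose cells expire after a period of silence.

    Hypothetical sketch: names, hashing scheme and the fixed timeout are
    assumptions, not the design from the thesis.
    """

    def __init__(self, num_cells=1024, num_hashes=3, timeout=30.0):
        self.num_cells = num_cells
        self.num_hashes = num_hashes
        self.timeout = timeout
        self.counters = [0] * num_cells       # CBF part: packet counts
        self.last_seen = [None] * num_cells   # TBF part: latest arrival time

    def _cells(self, flow_key):
        # Derive the cell indexes by salting one hash function; a set is
        # returned so a cell hit twice by one key is only updated once.
        cells = set()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{flow_key}".encode()).digest()
            cells.add(int.from_bytes(digest[:8], "big") % self.num_cells)
        return cells

    def add_packet(self, flow_key, now):
        """Record one packet and return the estimated packet count of its flow."""
        cells = self._cells(flow_key)
        for c in cells:
            seen = self.last_seen[c]
            # A cell that has been silent longer than the timeout is treated
            # as belonging to a finished flow, so its counter is cleared.
            if seen is not None and now - seen > self.timeout:
                self.counters[c] = 0
            self.counters[c] += 1
            self.last_seen[c] = now
        # As in a plain counting Bloom filter, the minimum is the estimate.
        return min(self.counters[c] for c in cells)

    def is_heavy_hitter(self, flow_key, threshold):
        """Check the current estimate against a packet-count threshold."""
        return min(self.counters[c] for c in self._cells(flow_key)) >= threshold
```

In this sketch a flow that stops without a FIN or RST simply goes silent, and its cells are reclaimed the next time a packet touches them after the timeout, which is the effect the abstract attributes to the TBF.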
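The flow-set idea in contribution 4 can be sketched in the same hedged way. The abstract does not define the triplet, so the sketch assumes it is (destination IP, destination port, transport protocol); `FlowSetLabeler`, `votes_needed` and `toy_classifier` are hypothetical names, and the fixed vote count stands in for the error-rate threshold derived in the thesis.

```python
from collections import Counter, defaultdict


class FlowSetLabeler:
    """Label flows per triplet set, classifying only the first few flows.

    Hypothetical sketch: the triplet definition and the fixed number of
    votes are assumptions, not taken from the thesis.
    """

    def __init__(self, classify, votes_needed=3):
        self.classify = classify              # expensive per-flow classifier
        self.votes_needed = votes_needed
        self.votes = defaultdict(Counter)     # triplet -> votes per application
        self.decided = {}                     # triplet -> voted application
        self.classifier_calls = 0

    def label(self, flow):
        triplet = (flow["dst_ip"], flow["dst_port"], flow["proto"])
        # Once a triplet set has been voted on, reuse its label and skip
        # the classifier entirely; this is where traffic is reduced.
        if triplet in self.decided:
            return self.decided[triplet]
        self.classifier_calls += 1
        app = self.classify(flow)
        ballot = self.votes[triplet]
        ballot[app] += 1
        if sum(ballot.values()) >= self.votes_needed:
            # Majority vote fixes the application for the whole set.
            self.decided[triplet] = ballot.most_common(1)[0][0]
            del self.votes[triplet]
        return app


def toy_classifier(flow):
    # Stand-in for a real classifier, used only to exercise the sketch.
    return "http" if flow["dst_port"] == 80 else "unknown"
```

With `votes_needed=3`, ten flows to the same server triplet trigger only three classifier calls; the remaining seven reuse the voted label. A larger vote count trades processing rate for a lower chance that a few misclassified flows decide the set.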
Keywords/Search Tags: Traffic identification, Traffic measurement, Traffic sampling, Bloom Filter, Counting Bloom Filter, Decision tree, Flow set