
Research On High-Performance Internet Flow Identification Algorithms

Posted on: 2020-09-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X G Zhang
GTID: 1368330611455432
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of the Internet, the sharp increase in network bandwidth and the rapid growth in the number of network users, network applications of all kinds are springing up and network behavior is becoming increasingly complicated. Packet-based network traffic analysis no longer meets the needs of network management; network flow technology is widely applied instead. Network flow identification is both the basic step and the bottleneck of network flow technology. Although some flow identification strategies are widely used in network security management, performance management, billing management, traffic classification, software-defined networking, traffic matrix estimation, load balancing management and complex network structure analysis, there is still room for improvement in their efficiency and accuracy. This dissertation therefore aims to improve the accuracy and efficiency of flow identification and to provide more applicable flow identification algorithms for network flow analysis.

The dissertation focuses on network flow identification. First, taking network flow attributes as the research object, it compares the attribute characteristics of network flows among different networks within a region, as well as the differences and similarities of these characteristics between regions, providing a basis and reference for research on new flow identification algorithms. Second, the rationality of the timeout strategy is analyzed quantitatively based on the characteristics of network flow attributes, and a timeout threshold selection algorithm is proposed that aims to ensure the integrity of the identified flows. This algorithm can provide reasonable timeout thresholds for high-precision flow identification. Third, because single-packet flows account for a high percentage of all flows, the dissertation addresses the optimization of single-packet flow identification. Based on an analysis of TCP connection effectiveness and on the highly discriminative attributes of network flows, a fast filtering algorithm for TCP single-packet flows is proposed, which improves the efficiency of flow identification. Finally, for TCP traffic, the contributions of TCP flow attributes to flow identification are quantified. To improve identification performance, a new identification algorithm for TCP flows is proposed, based on a finite state automaton designed according to the Transmission Control Protocol.

The main contributions and innovations of this dissertation are as follows.

(1) Taking network flow attributes as the research object, real IP trace data from within one region and from different regions are used. Based on the identification of massive numbers of network flows and the extraction of their attributes, the variation trend of each flow attribute over time is studied, and the flow attributes of different regions are compared and analyzed. On the one hand, this study of flow characteristics gives researchers a more comprehensive picture of how flow attributes currently vary across time and region; on the other hand, it provides useful reference data and starting points for research on more accurate and efficient flow identification algorithms. The feasibility and rationality of the timeout strategy for flow identification are then analyzed and discussed using the attribute recognition degree. From the perspective of improving identification accuracy, a novel timeout threshold selection algorithm is proposed that aims to ensure the integrity of network flows, and a reasonable empirical timeout threshold is obtained through experiments on a large volume of measured IP trace data.

(2) In view of the large number of single-packet flows and the lack of optimization for their identification, the characteristics of TCP single-packet flows are studied in depth. The TCP packet state, the TCP packet arrival interval and the TCP packet size are found to be highly correlated with TCP single-packet flows. Based on these highly discriminative attributes, a fast identification algorithm for TCP single-packet flows is proposed. The algorithm is efficient and relatively simple to implement. Being lightweight, it can serve as a front-end filtering mechanism for flow identification that quickly identifies TCP single-packet flows. Because the identified single-packet flows no longer occupy memory or consume computing resources for a long time, the performance of flow identification improves effectively.

(3) For TCP traffic, which dominates network traffic, the concept of attribute recognition degree is proposed based on information entropy and is used to quantify the contribution of each TCP flow attribute to flow identification. The TCP packet arrival interval is found to have the highest recognition degree, followed by the TCP packet state. The TCP packet size also has a recognition degree, but a very low one. The recognition degrees of the time to live, source port, destination port and type of service are so small that these attributes can be treated as irrelevant to flow identification. Given the high recognition degree of the packet arrival interval and the TCP packet state, and given the strict rules of the Transmission Control Protocol for connection establishment, data transmission and connection release, a bidirectional flow automaton is constructed using the finite state automaton principle. An identification algorithm that uses this bidirectional flow automaton to identify TCP flows is proposed. It is a dedicated identification algorithm for TCP flows and is superior to the classical flow identification algorithm and to existing representative flow identification algorithms.
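
To make the timeout strategy referred to in the abstract concrete, the following is a minimal Python sketch of generic timeout-based flow identification. It is not the dissertation's threshold selection algorithm or its automaton-based method. It assumes that a flow is keyed by the usual 5-tuple, that packets arrive sorted by timestamp, and that a flow ends when the gap between two consecutive packets with the same key exceeds the timeout. The names `Packet`, `Flow` and `identify_flows` are hypothetical, and the 64-second timeout is an arbitrary illustrative value, not the empirical threshold reported in the dissertation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Assumed flow key: (src IP, dst IP, src port, dst port, protocol number).
FlowKey = Tuple[str, str, int, int, int]


@dataclass
class Packet:
    ts: float      # arrival time in seconds
    key: FlowKey
    size: int      # packet size in bytes


@dataclass
class Flow:
    key: FlowKey
    first_ts: float
    last_ts: float
    packets: int = 0
    octets: int = 0


def identify_flows(packets: Iterable[Packet], timeout: float) -> List[Flow]:
    """Group time-ordered packets into flows using an inactivity timeout."""
    active: Dict[FlowKey, Flow] = {}
    finished: List[Flow] = []
    for pkt in packets:
        flow = active.get(pkt.key)
        # Close the existing flow if it has been idle longer than the timeout.
        if flow is not None and pkt.ts - flow.last_ts > timeout:
            finished.append(flow)
            flow = None
        if flow is None:
            flow = Flow(pkt.key, pkt.ts, pkt.ts)
            active[pkt.key] = flow
        flow.last_ts = pkt.ts
        flow.packets += 1
        flow.octets += pkt.size
    # At the end of the trace, every flow that is still open is emitted as is.
    finished.extend(active.values())
    return finished


# Illustrative trace: two packets of flow a, one of flow b, then a late packet of a.
a = ("10.0.0.1", "10.0.0.2", 1234, 80, 6)
b = ("10.0.0.3", "10.0.0.2", 5555, 80, 6)
trace = [Packet(0.0, a, 60), Packet(1.0, a, 1500),
         Packet(2.0, b, 60), Packet(100.0, a, 60)]
flows = identify_flows(trace, timeout=64.0)
single_packet_flows = [f for f in flows if f.packets == 1]
```

In this sketch an idle flow is closed only when a later packet with the same key arrives or the trace ends, which is adequate for offline trace analysis but keeps idle flows in memory. That cost is what makes early filtering of single-packet flows, as described in the abstract, attractive in an online setting.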
Keywords/Search Tags: network flow, flow characteristic, flow filtration, flow identification, finite state automaton, attribute recognition degree, flow timeout