Research On Traffic Classification Method Based On Graphlet Patterns

Posted on: 2014-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: L S Shi
GTID: 2268330401466217
Subject: Communication and Information System
Abstract/Summary:
Classifying network traffic flows by application is an important task: the results can be used to plan and design effective networks and to monitor application trends in operational networks. However, port-based classification methods have become less accurate than before because of the emergence of many new applications. Methods based on packet-payload analysis are highly accurate, but they are computationally expensive and raise user-privacy concerns. Machine-learning methods come in so many varieties that choosing among them is difficult, and they also tend to be complex. Methods based on transport-layer characteristics usually classify using a large number of flow-level parameters but do not analyze traffic from the perspective of host interaction, so some useful information is ignored.
The BLINC (Blind Classification) method, proposed at ACM SIGCOMM by Thomas Karagiannis et al., classifies network traffic by exploiting the differences among the transport-layer graphlet connection patterns of different network applications, and it obtains good results. Compared with other methods, its main characteristics are that it does not use packet payload contents, that it reveals the essential characteristics of an application by analyzing host interactions, and that it scales well. By adjusting a series of threshold parameters, BLINC classifies 80%-90% of flows with accuracies of 90%-95%. However, BLINC only lists some of the graphlet patterns; it gives neither the detailed analysis and construction of each pattern nor the detailed algorithm. In addition, the graphlets that BLINC uses are all centered on the source node, so classification accuracy is very high but completeness is low.
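As a rough illustration of what a source-node-centered graphlet captures, the sketch below groups flow records by source host and records the distinct destination addresses, source ports, and destination ports each host uses. The `Flow` record, the `source_graphlets` helper, and the sample addresses are hypothetical and are not taken from the thesis or from BLINC; real graphlets also track the protocol and the relationships among these sets.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """One transport-layer flow record (field names are illustrative)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str  # e.g. "tcp" or "udp"


def source_graphlets(flows):
    """Group flows by source host and summarize each host's fan-out.

    Returns {src_ip: {"dst_ips": set, "src_ports": set, "dst_ports": set}}.
    """
    graphlets = defaultdict(
        lambda: {"dst_ips": set(), "src_ports": set(), "dst_ports": set()}
    )
    for f in flows:
        g = graphlets[f.src_ip]
        g["dst_ips"].add(f.dst_ip)
        g["src_ports"].add(f.src_port)
        g["dst_ports"].add(f.dst_port)
    return dict(graphlets)


# Made-up sample: one host answering three peers from a single source port.
flows = [
    Flow("10.0.0.1", 80, "10.0.0.7", 51000, "tcp"),
    Flow("10.0.0.1", 80, "10.0.0.8", 51001, "tcp"),
    Flow("10.0.0.1", 80, "10.0.0.9", 51002, "tcp"),
]
g = source_graphlets(flows)["10.0.0.1"]
print(len(g["dst_ips"]), len(g["src_ports"]), len(g["dst_ports"]))  # prints: 3 1 3
```

A host that uses a single source port toward many destination hosts and ports looks like a server in this summary. Any thresholds used to turn such counts into an application label would be tuning parameters and are not specified here.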
To address these problems, the main work of this thesis is as follows.
First, the host interaction behaviors of eleven application types, namely Attack, Web, Spam-Filter, Game, NM (Network Management), Chat, Mail, Streaming, FTP, DNS, and P2P, are analyzed in detail one by one. Their source-node-centered graphlet patterns and respective characteristics are given, and similar graphlet patterns are analyzed and differentiated through heuristics. Based on the BLINC method, a matching algorithm that uses the source-node-centered graphlet patterns is proposed, and its classification performance is validated in a large network on four traditional applications (Web, DNS, Mail, FTP) and one newer application (P2P). The experimental results show that classification accuracy is very high but completeness is low.
Second, to address the low completeness of the method based on source-node-centered graphlet patterns, this thesis proposes supplementing the original classification algorithm with destination-node-centered graphlet patterns and port analysis. In the same way, by analyzing the working principles of the eleven applications above, their destination-node-centered graphlet patterns and respective characteristics are summarized.
Third, a combined classification algorithm is proposed that consists of three parts: source-node-centered graphlet patterns, destination-node-centered graphlet patterns, and port analysis. The combined algorithm is applied to the same set of Web, DNS, Mail, FTP, and P2P traffic data, and its classification performance is validated under the same threshold values.
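One plausible way to combine the three parts is a simple cascade in which each stage is tried in turn and the first stage that returns a label wins. The abstract does not state how the parts are ordered or merged, so the ordering below, the stage functions, the dictionary keys, and the small port table are all assumptions made for illustration only.

```python
from typing import Callable, Optional, Sequence

# Each stage takes a flow (a plain dict here) and returns an application
# label, or None when it cannot decide. All three stages are placeholders.
Stage = Callable[[dict], Optional[str]]

# Illustrative well-known ports only; not the port table used in the thesis.
WELL_KNOWN_PORTS = {21: "FTP", 25: "Mail", 53: "DNS", 80: "Web"}


def match_source_graphlet(flow: dict) -> Optional[str]:
    """Placeholder: label previously derived from the source host's graphlet."""
    return flow.get("src_label")


def match_destination_graphlet(flow: dict) -> Optional[str]:
    """Placeholder: label previously derived from the destination host's graphlet."""
    return flow.get("dst_label")


def match_port(flow: dict) -> Optional[str]:
    """Fallback: look the destination port up in a small static table."""
    return WELL_KNOWN_PORTS.get(flow["dst_port"])


DEFAULT_STAGES: Sequence[Stage] = (
    match_source_graphlet,
    match_destination_graphlet,
    match_port,
)


def classify(flow: dict, stages: Sequence[Stage] = DEFAULT_STAGES) -> str:
    """Return the first label produced by any stage, else "Unknown"."""
    for stage in stages:
        label = stage(flow)
        if label is not None:
            return label
    return "Unknown"


print(classify({"dst_port": 80, "src_label": "P2P"}))  # source graphlet decides
print(classify({"dst_port": 53}))                      # falls through to port analysis
print(classify({"dst_port": 6000}))                    # no stage matches
```

In a cascade like this, the later stages only see flows that the earlier ones left unclassified, which is one way extra stages could raise completeness without disturbing the labels the first stage already assigns.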
The experimental results show that the combined algorithm achieves clearly higher completeness than the original algorithm while classification accuracy remains almost unchanged.
Lastly, traffic classification software based on graphlet patterns is implemented. The software consists of three parts: data input and output modules, classification algorithm modules, and a GUI display and interaction module. It provides a friendly interface, making it much easier for users to analyze and classify traffic data.
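The completeness and accuracy figures discussed above can be made concrete with a small sketch. It assumes the usual BLINC-style definitions, which the abstract itself does not spell out: completeness is the fraction of flows that receive any label, and accuracy is the fraction of labeled flows whose label is correct. The function name and the sample labels are made up for illustration.

```python
def completeness_and_accuracy(predicted, truth):
    """Compute (completeness, accuracy) for one classification run.

    predicted: list of labels, with None marking an unclassified flow.
    truth:     list of ground-truth labels, same length as predicted.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    # Keep only the flows the classifier actually labeled.
    classified = [(p, t) for p, t in zip(predicted, truth) if p is not None]
    completeness = len(classified) / len(predicted) if predicted else 0.0
    correct = sum(1 for p, t in classified if p == t)
    accuracy = correct / len(classified) if classified else 0.0
    return completeness, accuracy


# Made-up example: 3 of 5 flows are labeled, and 2 of those 3 are correct.
predicted = ["Web", None, "DNS", "P2P", None]
truth = ["Web", "Web", "DNS", "Mail", "FTP"]
c, a = completeness_and_accuracy(predicted, truth)
print(f"completeness={c:.2f} accuracy={a:.2f}")  # prints: completeness=0.60 accuracy=0.67
```

Under these definitions, a classifier can score high accuracy simply by declining to label flows it is unsure about, which is why the two numbers are reported together.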
Keywords/Search Tags: traffic classification, BLINC, graphlet patterns, matching