Research On The Key Technologies Of Network Traffic Classification

Posted on: 2012-01-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Z Lin
GTID: 1488303356472024
Subject: Information security
Abstract/Summary:
With the continued development of network infrastructure and terminal technology, the number of Internet users and network applications in China is increasing rapidly. Informatization has benefited from government support for third-generation (3G) mobile technology, triple play, and networking. These efforts accelerate industry convergence and enrich people's spiritual and cultural life. However, regulators face new problems, such as Internet Service Providers' demand for differentiated quality of service, threats to network and information security, and the need to regulate sexual and violent content. The ability to classify network traffic accurately and efficiently is key to all of these problems.

This dissertation summarizes the advantages and drawbacks of the main network traffic classification algorithms. To meet the challenges posed by anti-identification techniques, including random ports and encrypted transmission, it studies a network traffic classification algorithm based on semi-supervised clustering, an automatic signature mining algorithm, and a network traffic management system. The main contributions are as follows:

(1) Approaches based on payload signatures classify network traffic more accurately and efficiently than other algorithms, but their performance depends heavily on a comprehensive and up-to-date signature database. Existing approaches extract payload signatures through a manual process that is time-consuming and complicated. This dissertation proposes a novel payload signature mining algorithm, based on PrefixSpan, that automatically extracts signatures from the traffic of a specific network application. Restricting the mining process to continuous sequential patterns and applying an offset constraint in the payload significantly reduces the size of the final signature database.
The algorithm mines the complete set of signatures under the offset constraint and outperforms the Apriori-based algorithm. Moreover, experimental results show high precision and a low error rate when the mined signatures are used for network traffic classification.

(2) The declining accuracy of port-based classification and the inability of payload-based classification to identify unknown traffic motivate the use of transport-layer statistics for network traffic classification. Approaches based on semi-supervised clustering can identify unknown network traffic and easily map unlabeled clusters to network applications. This dissertation proposes a novel semi-supervised clustering approach, based on an improved K-Means algorithm, to partition a training set of network flows that contains a large number of unlabeled flows and only a few labeled flows. A greedy algorithm and the labeled flows are used to initialize the cluster centers instead of selecting them at random. Maximum likelihood estimation is used to construct a mapping from the clusters to the predefined set of traffic classes. Experimental results show that both the overall accuracy and the SSE (sum of squared errors) of the proposed algorithm are better than those of the standard K-Means algorithm.

(3) Almost every existing network traffic management system employs only one network traffic classification approach. There is a tension between identifying unknown traffic and the accuracy of the classification results. This dissertation constructs a novel framework that uses both a payload-based algorithm and a machine learning algorithm. The results generated by each algorithm are evaluated centrally against a specific standard. In addition, modules for automatic signature mining and for self-learning in the machine learning algorithm are included in the framework so that the system can be updated in a timely manner. The framework also supports various network traffic control methods and can be deployed in in-path or bypass mode.
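The semi-supervised clustering idea in contribution (2) can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the abstract does not specify the greedy rule or the feature set, so the farthest-point seeding, the majority-vote mapping (used here as the maximum likelihood estimate of a cluster's class), the two-dimensional toy features, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def seed_centers(flows, labels, k):
    """Seed one center per labeled class, then add distant flows greedily."""
    by_class = {}
    for x, y in zip(flows, labels):
        if y is not None:
            by_class.setdefault(y, []).append(x)
    centers = [mean(pts) for _, pts in sorted(by_class.items())]
    if not centers:
        centers = [flows[0]]  # no labeled flows at all: start from any flow
    while len(centers) < k:
        # Greedy step (assumed rule): pick the flow farthest from all centers.
        centers.append(
            max(flows, key=lambda x: min(math.dist(x, c) for c in centers))
        )
    return centers[:k]


def assign(flows, centers):
    """Index of the nearest center for every flow."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
        for x in flows
    ]


def kmeans(flows, labels, k, iters=20):
    """Standard Lloyd iterations started from the label-aware seeds."""
    centers = seed_centers(flows, labels, k)
    for _ in range(iters):
        owner = assign(flows, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [x for x, o in zip(flows, owner) if o == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(flows, centers)


def map_clusters(owner, labels, k):
    """Map each cluster to its most frequent labeled class, else 'unknown'."""
    mapping = {}
    for j in range(k):
        votes = Counter(
            y for o, y in zip(owner, labels) if o == j and y is not None
        )
        mapping[j] = votes.most_common(1)[0][0] if votes else "unknown"
    return mapping


# Toy data: two labeled flows, six unlabeled ones, three natural groups.
flows = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0),
         (8.2, 7.9), (7.9, 8.1), (1.0, 9.0), (1.1, 9.2)]
labels = ["http", None, None, "p2p", None, None, None, None]
centers, owner = kmeans(flows, labels, k=3)
print(map_clusters(owner, labels, k=3))
```

A cluster that contains no labeled flow is reported as "unknown", which is how a clustering-based classifier can surface traffic that no predefined class covers.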
(4) Limiting network speed is the chief requirement of network management. After identifying network traffic, a network management system deployed in bypass mode can send manually constructed packets or notify other network security systems to manage specific network flows. However, these methods are limited by their complexity or their effectiveness. This dissertation proposes a novel approach that uses the sliding window field of TCP to restrict network speed in bypass deployment. Packets with a constructed sliding window field are sent to control the speed of network flows at byte granularity. In addition, a performance evaluation model for network traffic classification is constructed. Formulas in byte units and flow units can be used to calculate the data throughput of a network management system during performance evaluation, thereby reducing redundancy.
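The window-based speed limit in contribution (4) rests on a general property of TCP: a sender may have at most one advertised window of unacknowledged data in flight, so its throughput is bounded by window / RTT. The sketch below only computes a window field value for a target rate. The abstract does not say how the dissertation chooses the value or builds the packets, so the function names, the round-trip-time input, and the minimum-of-one clamp are assumptions.

```python
MAX_WINDOW_FIELD = 0xFFFF  # the TCP header window field is 16 bits wide


def window_field_for_rate(target_bytes_per_s, rtt_s, wscale_shift=0):
    """Window field value that caps a flow near target_bytes_per_s.

    wscale_shift is the window scale negotiated at connection setup
    (0 when the option is not in use).
    """
    if target_bytes_per_s <= 0 or rtt_s <= 0:
        raise ValueError("rate and RTT must be positive")
    window_bytes = int(target_bytes_per_s * rtt_s)
    field = window_bytes >> wscale_shift  # express the window in scaled units
    # Assumed policy: never advertise 0, which would stall the flow entirely.
    return max(1, min(field, MAX_WINDOW_FIELD))


def rate_bound(field, rtt_s, wscale_shift=0):
    """Upper bound on throughput (bytes/s) implied by a window field value."""
    return (field << wscale_shift) / rtt_s


# Example: cap a flow at 200,000 bytes/s on a path with a 250 ms round trip.
field = window_field_for_rate(200_000, 0.25)
print(field, rate_bound(field, 0.25))
```

Without window scaling the field saturates at 65,535 bytes, so on short round trips this mechanism cannot express high rate limits; the bound it enforces is also approximate because the RTT varies over the life of a flow.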
Keywords/Search Tags: network traffic classification, sequential pattern mining, semi-supervised clustering, network management system, performance evaluation model