The Research Of P2P Traffic Identification Based On Improved K-means Clustering Algorithm

Posted on: 2014-09-04  Degree: Master  Type: Thesis
Country: China  Candidate: W Y Zhao  Full Text: PDF
GTID: 2268330401985531  Subject: Detection Technology and Automation
Abstract/Summary:
With the rapid development of Internet technology, P2P technology has spread quickly: its resource-sharing characteristics enrich the resources available online and bring great convenience to people's lives. It also brings problems such as network congestion and heavy bandwidth consumption, and operators have had to add bandwidth to maintain quality of service. As the number of P2P applications grows and networks expand, operators fall into a vicious cycle of "congestion, more bandwidth, congestion again", which fails to guarantee even basic quality of service while raising bandwidth costs. The fundamental way to solve the problem is to identify P2P traffic quickly and accurately, and then to control and optimize traffic management according to actual conditions. This thesis describes the structural characteristics of P2P technology and the development of P2P traffic identification methods. It studies the use of machine learning algorithms in P2P traffic identification, analyzes the advantages and disadvantages of supervised, unsupervised, and semi-supervised machine learning algorithms, and adopts a semi-supervised algorithm for P2P traffic identification. First, because the features of P2P traffic include redundant and irrelevant ones, the CFS algorithm is used for feature selection; it completes feature selection efficiently and removes redundant features of P2P traffic while preserving classification accuracy.
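The feature-selection step can be illustrated with a short sketch. It assumes that "CFS" here means correlation-based feature selection in the sense of Hall's merit heuristic, which scores a feature subset by its average feature-class correlation, penalized by the average correlation among the features themselves. The function name `cfs_merit` and its inputs are hypothetical and are not taken from the thesis.

```python
import math

def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset under the CFS heuristic (assumed form).

    feature_class_corr: absolute correlation of each feature with the class.
    feature_feature_corr: absolute correlation of every feature pair in the subset.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    r_cf = sum(feature_class_corr) / k  # mean feature-class correlation
    # Mean feature-feature correlation; zero when the subset has one feature.
    r_ff = (sum(feature_feature_corr) / len(feature_feature_corr)
            if feature_feature_corr else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Adding a fully redundant feature does not raise the merit,
# while adding an uncorrelated, equally predictive one does.
single = cfs_merit([0.5], [])
redundant = cfs_merit([0.5, 0.5], [1.0])
independent = cfs_merit([0.5, 0.5], [0.0])
```

Under this score, a search over subsets favors features that predict the traffic class but are not duplicates of one another, which matches the stated goal of removing redundant and irrelevant features.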
Second, because each iteration of the k-means algorithm requires distance calculations over all of the data, the algorithm is improved with a ball tree, which reduces the number of iterations and improves clustering efficiency. To address the low identification accuracy of k-means, a small number of labeled samples is used to guide the choice of initial cluster centers, and clustering then proceeds with the improved k-means algorithm. Experiments in Weka 3.7 verify that the method identifies P2P traffic accurately and quickly. Finally, a P2P traffic control system is built using two-dimensional dynamic flow control technology, achieving effective control of BT and other P2P traffic.
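The idea of guiding the initial cluster centers with a few labeled samples can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function `seeded_kmeans`, the helper `mean_vector`, and the toy feature vectors are hypothetical, the nearest-center search is brute force rather than ball-tree accelerated, and the labeled samples are used only to place the initial centers.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(vectors) for c in zip(*vectors))

def seeded_kmeans(points, labeled, max_iter=100):
    """Cluster `points`, starting from centers derived from labeled samples.

    points:  list of feature tuples for unlabeled flows.
    labeled: dict mapping a class label to a list of labeled feature tuples.
    Returns (predicted label per point, final centers).
    """
    labels = sorted(labeled)
    # Each initial center is the mean of the labeled samples of one class.
    centers = [mean_vector(labeled[name]) for name in labels]
    assignment = None
    for _ in range(max_iter):
        # Brute-force nearest center; a ball tree could replace this search.
        new_assignment = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centers will not move
        assignment = new_assignment
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean_vector(members)
    return [labels[a] for a in assignment], centers

# Toy data: two labeled flows seed the centers for four unlabeled flows.
flows = [(0.1, 0.2), (0.0, 0.1), (5.0, 5.1), (4.9, 5.2)]
seeds = {"p2p": [(5.0, 5.0)], "web": [(0.0, 0.0)]}
predicted, final_centers = seeded_kmeans(flows, seeds)
```

Because each cluster starts from a labeled class, the resulting clusters carry class names directly, so no separate step is needed to map clusters to traffic types.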
Keywords/Search Tags: P2P traffic identification, k-means algorithm, semi-supervised, ball tree, two-dimensional flow control