Distributed Online Traffic Classification Based On Semi-supervised Learning

Posted on: 2014-09-06
Degree: Master
Type: Thesis
Country: China
Candidate: X M Yu
GTID: 2268330425481065
Subject: Computer application technology
Abstract/Summary:
As the Internet has developed, networks have grown steadily larger and a variety of novel applications, such as P2P and IPTV, have emerged. On the one hand, the rapid growth of traffic produced by these applications places a serious burden on bandwidth and aggravates Internet congestion. On the other hand, malicious P2P-based traffic also appears frequently; it not only consumes bandwidth faster but also poses a great challenge to network security. As the number of Internet applications and link speeds increase, the network management equipment at Internet nodes faces ever greater pressure. Providing effective techniques to identify and monitor Internet traffic, so that malicious traffic can be managed and controlled and bandwidth resources and services allocated reasonably, has therefore become one of the important issues in network management.

Traffic classification is the key to and foundation for solving these problems, especially classification methods based on semi-supervised learning. Semi-supervised learning has become a research hot spot in traffic classification because it not only improves classifier performance using a small quantity of labeled data but can also discover novel communication patterns. In view of the naturally regional and time-dependent character of traffic, the ideas and techniques of distributed data processing are introduced into online traffic classification. The classification task is decentralized across multiple nodes; a central node manages and coordinates the work of each child node and validates the classification results produced on the child nodes.

First, a semi-supervised clustering algorithm is proposed for traffic classification, addressing the problem that most classifiers based on supervised learning depend too heavily on labeled data. The algorithm uses samples whose application category is accurately known as the initial cluster centers, which allows it not only to identify various applications effectively but also to discover novel applications.

Second, because Internet traffic is real-time, variable, and transient, research on online traffic classification is highly meaningful. Aimed specifically at real-time traffic classification, an online traffic classification model is proposed that is trained offline and classifies traffic online to improve classification efficiency. It uses the semi-supervised clustering algorithm proposed in this thesis to validate the classification results in real time, so that the online classifier is updated promptly and the model adapts well to the network environment.

Finally, a distributed online traffic classification method is investigated to solve the problem that traffic is difficult to classify in high-speed networks. Building on the research into supervised learning algorithms and online traffic classification, the method brings the ideas of distributed data processing into online traffic classification. It decentralizes real-time traffic classification to individual nodes instead of classifying centrally, and focuses on having the nodes classify traffic online cooperatively.
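
The seeded clustering idea in the abstract (labeled samples serve as the initial cluster centers, and flows far from every center are candidates for novel applications) can be illustrated with a minimal sketch. This is not the thesis's algorithm, whose details the abstract does not give. It is a plain seeded k-means in which the function names, the two-dimensional flow features, and the `novelty_radius` threshold are all assumptions made for illustration.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def seeded_kmeans(labeled, unlabeled, iterations=10):
    """Cluster flows, starting from centers built from labeled samples.

    labeled:   dict mapping an application name to its labeled feature vectors
    unlabeled: list of feature vectors with no application label
    Returns a dict mapping each application name to its final center.
    """
    # Assumption: each initial center is the mean of that application's seeds.
    centers = {app: mean(vectors) for app, vectors in labeled.items()}
    for _ in range(iterations):
        # Labeled samples always stay in the cluster of their own application.
        clusters = {app: list(vectors) for app, vectors in labeled.items()}
        for x in unlabeled:
            nearest = min(centers, key=lambda app: math.dist(x, centers[app]))
            clusters[nearest].append(x)
        centers = {app: mean(vectors) for app, vectors in clusters.items()}
    return centers


def classify(x, centers, novelty_radius):
    """Return the nearest application, or "unknown" if x is far from all centers.

    novelty_radius is a hypothetical threshold standing in for whatever
    criterion is used to flag novel applications.
    """
    nearest = min(centers, key=lambda app: math.dist(x, centers[app]))
    if math.dist(x, centers[nearest]) > novelty_radius:
        return "unknown"
    return nearest


# Hypothetical flow features, e.g. (scaled packet size, scaled flow duration).
labeled = {"web": [[1.0, 1.0]], "p2p": [[9.0, 9.0]]}
unlabeled = [[1.2, 0.8], [8.8, 9.2], [0.8, 1.2]]
centers = seeded_kmeans(labeled, unlabeled)
```

With these toy inputs, `classify([1.1, 1.0], centers, 3.0)` returns `"web"`, while a distant point such as `[50.0, 50.0]` returns `"unknown"`, which is how a previously unseen application would surface in this sketch.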
Keywords/Search Tags: traffic classification, semi-supervised learning, distribution, online traffic classification