
A Peer To Peer Traffic Identification Method Using Machine Learning

Posted on: 2011-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhu
Full Text: PDF
GTID: 2178360308461565
Subject: Communication and Information System
Abstract/Summary:
In recent years, the rapid growth of P2P services has promoted the development of the Internet but has also brought many problems. (1) Bandwidth: P2P traffic continues to increase, consuming a large share of network bandwidth and even causing congestion, which degrades network performance and quality of service. (2) Copyright: in the Internet age, digital content can be easily copied and transmitted, and the popularity of P2P sharing software in particular has accelerated the distribution of pirated media, making intellectual property protection more difficult. (3) Network security: the P2P idea is free and fair sharing of resources. Each node is therefore independent, and the system places very few constraints on its nodes. In a system without a control center, no node knows what the other nodes are sharing, which provides favorable conditions for the spread of viruses.

With the increasing importance of the Internet and the growing complexity of network structure, the security, manageability, and availability of networks and of legacy applications are being challenged. There is a growing awareness that an in-depth understanding of P2P traffic and network behavior is needed to provide technical support for the analysis, monitoring, and management of P2P. Effective identification of P2P traffic has therefore become an urgent problem and a hot research topic.

This dissertation stems from one of the research tasks of a project funded by the National Natural Science Foundation (No. 60672025). In the course of the research, the author did the following work:

1. Starting from the working principles of P2P traffic identification, the thesis summarizes existing P2P traffic identification techniques in three categories: port-based identification, application-layer signature identification, and identification based on transport-layer characteristics. It analyzes their advantages and disadvantages, proposes applying data mining technology to P2P traffic identification, and extends the concept of P2P traffic identification to the identification of specific P2P services.
2. Based on the basic characteristics of P2P networks and their nodes, and combining machine learning clustering and classification algorithms, the ratio of upstream to downstream traffic is selected as the most important feature. A large amount of data is collected to build training and test sets, and a module and system for real-time identification of P2P services are designed. The whole system comprises two phases, learning and identification.
3. The system is implemented in code, and tests of its identification accuracy and CPU usage show that the method has high accuracy and low complexity.

The thesis is divided into six chapters. Chapter I briefly introduces the concept of P2P, the development of data mining technology, and the current status of P2P, and sets out the research direction. Chapter II describes the P2P and C/S models and the relationship between them, which leads to the principles and methods of P2P service identification, and analyzes trends in P2P traffic identification technology. Chapter III deals with the principles and steps of data mining and machine learning. Chapter IV describes how to use machine learning methods to identify P2P services, extracts and analyzes traffic data, and gives concrete steps for establishing the corresponding model. Chapter V implements the P2P service identification software, tests the recognition system, describes the test conditions and environment, and reports the accuracy and CPU usage of the machine-learning-based P2P detection algorithm. Chapter VI summarizes the full text and discusses prospects for future work.
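
As an illustration of the two phases the abstract mentions (learning and identification), the following is a minimal sketch that classifies a host from its upstream/downstream byte ratio. The abstract does not name the specific algorithm or data format used in the thesis, so everything here is an assumption made for illustration: the nearest-centroid classifier is a simple stand-in, the function names (log_ratio, learn, identify, accuracy) are hypothetical, and the byte counts are synthetic rather than measured.

```python
import math
from statistics import mean


def log_ratio(bytes_up, bytes_down):
    """Log of the upstream/downstream byte ratio.

    The +1 smoothing (an assumption of this sketch) avoids division by zero.
    """
    return math.log((bytes_up + 1) / (bytes_down + 1))


def learn(training_set):
    """Learning phase: compute one feature centroid per label.

    training_set holds (bytes_up, bytes_down, label) rows.
    """
    by_label = {}
    for up, down, label in training_set:
        by_label.setdefault(label, []).append(log_ratio(up, down))
    return {label: mean(values) for label, values in by_label.items()}


def identify(model, bytes_up, bytes_down):
    """Identification phase: return the label of the nearest centroid."""
    x = log_ratio(bytes_up, bytes_down)
    return min(model, key=lambda label: abs(model[label] - x))


def accuracy(model, test_set):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(identify(model, up, down) == label for up, down, label in test_set)
    return hits / len(test_set)


# Synthetic, hand-made rows (not data from the thesis): P2P hosts upload
# roughly as much as they download, other hosts mostly download.
TRAIN = [
    (900_000, 1_000_000, "p2p"),
    (1_200_000, 1_000_000, "p2p"),
    (700_000, 900_000, "p2p"),
    (20_000, 1_000_000, "other"),
    (5_000, 400_000, "other"),
    (50_000, 2_000_000, "other"),
]
TEST = [
    (800_000, 1_000_000, "p2p"),
    (10_000, 900_000, "other"),
]

model = learn(TRAIN)
print(identify(model, 800_000, 1_000_000))
print(accuracy(model, TEST))
```

The log of the ratio is used instead of the raw ratio so that "uploads twice as much" and "downloads twice as much" sit symmetrically around zero. A real system would need measured traffic, more features, and a classifier validated on held-out data.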
Keywords/Search Tags: P2P, Ratio between upload traffic and download traffic, Data mining, Machine learning, Traffic identification