Research on P2P Traffic Identification Based on Transfer Learning

Posted on: 2015-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: L Cai
Full Text: PDF
GTID: 2298330467963423
Subject: Military communications science
Abstract/Summary:
With the growth of P2P-based Internet applications and the rapid increase in their users, the network resources consumed by P2P technology are putting increasing pressure on network construction and maintenance. How to manage P2P applications so that they can develop healthily within existing networks has become a focus of attention for experts and scholars.

P2P traffic identification is the basis for managing P2P applications, and research on it has continued without interruption. The major identification algorithms are based on port detection, content-based scanning, or traffic characteristics. These techniques solve part of the P2P traffic identification problem, but each has its own flaws.

Machine learning is a popular field in today's computer science research. Machine learning algorithms can automatically derive rules from data and use those rules to make predictions on unseen data, which suits P2P traffic identification well. Many machine learning algorithms can identify P2P traffic effectively, but they depend on a large number of hand-labeled training samples. When network conditions change rapidly, these samples are difficult to reuse.

This paper studies P2P traffic identification within a newer machine learning framework, transfer learning, combined with traditional machine learning algorithms. The resulting algorithms achieve better recognition accuracy when only a small number of hand-labeled samples are available. The main contributions and innovations of this paper are as follows.

First, this paper studies TrAdaBoost, a transfer learning method from the field of text classification, introduces it into P2P traffic identification, and improves its time efficiency. Taking the characteristics of P2P traffic identification into account, the method adjusts the weights of the auxiliary data so that the auxiliary data transfers more readily to the source data, and trains the classifier on the combined training set. In addition, auxiliary data is removed dynamically according to the error rate of each iteration. This improvement speeds up the iterations and reduces time consumption. The simulation results show that the improved algorithm is better suited to real-time use.

Second, this paper combines the traditional KNN method with the transfer learning framework, proposes a KNN-based transfer learning algorithm, applies it to P2P traffic identification, and then reduces the algorithm's complexity. The new algorithm uses KNN to screen the auxiliary data, removes the auxiliary data that is less similar to the source data, and then trains a classifier on the remaining auxiliary data together with the source data. In addition, this paper uses SVD for pre-classification to improve KNN, which reduces the amount of KNN computation. The simulation results show that the new algorithm is effective and that the improved version is better suited to real-time use.

Third, this paper builds a simple P2P traffic identification system based on Java and the Web, which makes it easier to exchange data sets and algorithms. The system has a Web UI, implements the two algorithms above in Java, and makes them openly available. Users can upload their own data sets to be identified or download data sets shared by others. The system provides an effective platform for exchanging P2P traffic identification algorithms and data sets.
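To make the KNN-based screening step of the second contribution more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the similarity criterion, so this sketch assumes an auxiliary sample is kept when its label matches the majority label of its k nearest source samples under Euclidean distance. The function name `knn_screen`, the two toy features, and the labels are all hypothetical, and the SVD pre-classification step is not shown.

```python
import math
from collections import Counter

def knn_screen(aux, src, k=3):
    """Keep auxiliary samples whose label matches the majority label
    of their k nearest source samples (assumed screening criterion)."""
    kept = []
    for features, label in aux:
        # Rank source samples by Euclidean distance to this auxiliary sample.
        neighbours = sorted(src, key=lambda s: math.dist(features, s[0]))[:k]
        votes = Counter(lbl for _, lbl in neighbours)
        majority_label, _ = votes.most_common(1)[0]
        if majority_label == label:
            kept.append((features, label))
    return kept

# Toy flow samples: (hypothetical feature vector, label).
src = [((1.0, 1.0), "p2p"), ((1.2, 0.9), "p2p"), ((0.9, 1.1), "p2p"),
       ((5.0, 5.0), "other"), ((5.1, 4.8), "other"), ((4.9, 5.2), "other")]
aux = [((1.1, 1.0), "p2p"),    # agrees with nearby source samples
       ((5.0, 5.1), "p2p"),    # disagrees with nearby source samples
       ((4.8, 5.0), "other")]  # agrees with nearby source samples

kept = knn_screen(aux, src, k=3)

# The screened auxiliary data and the source data together form the
# training set for whatever classifier is used downstream.
training_set = src + kept
```

Screening before training means that auxiliary samples which look unlike the small labeled source set do not influence the final classifier. The cost is one nearest-neighbour search per auxiliary sample, which is the computation the abstract says the SVD pre-classification is meant to reduce.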
Keywords/Search Tags: P2P traffic identification, machine learning, transfer learning, AdaBoost, KNN