Traffic Identification and Botnet Detection Based on Online Transfer Learning and Deep Learning

Posted on: 2020-10-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C S Miao
GTID: 1488306350471844
Subject: Computer system architecture
Abstract/Summary:
Traffic classification is widely applied in network measurement, network security, and quality of service. This dissertation studies network traffic classification based on online transfer learning and the detection of anomalous network traffic (mainly botnets) based on deep learning.

A new exact multiple string matching algorithm, q2-BNDM, is proposed for fast annotation of data sets. It is a filtration-based matching algorithm that adopts a multi-pattern serial-parallel strategy combining BNDM with q-grams. The algorithm has been evaluated experimentally and compared with other algorithms such as DFA, AC_BM, and MWM. Its preprocessing phase is very fast, its memory usage is fairly small, and it is considerably faster on huge sets of several thousand patterns. These advantages stem from the bit parallelism of BNDM and from the improved filtering efficiency obtained by using q-grams and a parallel connection strategy for string patterns.

Feature selection is of great importance to the accuracy and efficiency of network traffic classification. Guided by the memetic framework, a new hybrid feature selection method combining the wrapper and filter models is proposed. The new memetic algorithm combines local improvement with the traditional genetic algorithm: the global search uses classifier accuracy as the fitness function to ensure global optimization, while the local search uses joint mutual information as the evaluation criterion to accelerate convergence toward the optimal feature set. Experiments indicate that, compared with existing methods, the proposed algorithm significantly improves the number of selected features and the computational complexity. Network traffic classification using this algorithm can achieve higher classification accuracy with fewer features.

Classifying network traffic with machine learning methods based on traffic statistics requires a large amount of training data to train a sufficiently good classifier. Such methods can neither resolve the spatial and temporal inconsistency between the training set and the test set nor cope with concept drift. This dissertation investigates the online transfer learning problem in a homogeneous space with multiple source domains. For this multi-source homogeneous space, an online transfer learning algorithm, McMs-HomOTL, is proposed based on reducing the distance between the source domain and the target domain. Using the unlabeled data of the target domain, the BDA algorithm maps the source domain and the target domain to a new space, thereby reducing the distance between them. The source domain and the target domain are then integrated, the classifiers are combined on the labeled domain, and the classification model is extended to a multi-class classification model. Inspired by the idea of the AdaBoost algorithm, in which a strong classification model is built by combining several weak classifiers, n classifiers are trained in a single source domain and a dynamic weight is assigned to each classifier. This effectively improves the accuracy of the McMs-HomOTL algorithm. Experimental results show that traffic classification performance is significantly improved while the scalability and stability requirements of large-scale networks are satisfied.

Machine learning is widely applied in botnet detection. However, as the forms of botnets and their command and control mechanisms keep changing, selecting features manually becomes increasingly difficult. To solve this problem, a botnet detection framework based on deep learning is proposed. It automatically extracts features along three dimensions (payload, time, and space) and builds classifiers using several deep neural network architectures. The algorithm does not depend on any prior knowledge of the protocol or the topology, and it works without manual feature selection. Experimental results show that the proposed model performs well in botnet detection and can accurately identify botnet traffic.
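
As a rough illustration of the bit parallelism that q2-BNDM is said to build on, the following Python sketch implements plain single-pattern BNDM (Backward Nondeterministic DAWG Matching). It is not the dissertation's q2-BNDM: the q-gram filtering and the multi-pattern serial-parallel strategy are omitted, and the function name `bndm_search` and its interface are assumptions made here for illustration only.

```python
def bndm_search(pattern, text):
    """Return the start index of every occurrence of pattern in text.

    Illustrative single-pattern BNDM only; not the q2-BNDM algorithm.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # masks[c] has bit (m - 1 - i) set whenever pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << (m - 1 - i))

    full = (1 << m) - 1   # all m bits set: every pattern position still possible
    top = 1 << (m - 1)    # set when the scanned window suffix is a pattern prefix
    matches = []
    pos = 0
    while pos <= n - m:
        j = m        # characters of the window not yet scanned
        shift = m    # default: skip the whole window
        state = full
        # Scan the window right to left while some pattern factor survives.
        while state:
            state &= masks.get(text[pos + j - 1], 0)
            j -= 1
            if state & top:
                if j > 0:
                    shift = j            # a pattern prefix starts at pos + j
                else:
                    matches.append(pos)  # the whole window matched
            state = (state << 1) & full
        pos += shift
    return matches
```

For example, `bndm_search("abc", "xabcabc")` returns `[1, 4]`. Python integers have arbitrary precision, so this sketch does not limit the pattern length to the machine word size as a C implementation would.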
Keywords/Search Tags:Traffic Identification, Deep Learning, Convolutional Neural Networks, Feature Selection, Multiple String Matching, LSTM