
Abnormal Traffic Detection System Based On Machine Learning

Posted on: 2022-04-19  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang  Full Text: PDF
GTID: 2518306506496304  Subject: Computer technology
Abstract/Summary:
With the rapid development of digital information technology, the Internet has become deeply integrated into people's work, study, and daily life. At the same time, because many network users lack security awareness and attack techniques are growing more complex and intelligent, network security threats are becoming increasingly serious and have attracted wide concern. In the civil, commercial, and military fields, all kinds of networked application systems face complex attacks and security threats. As an effective means of defense, abnormal traffic detection plays an irreplaceable role in detecting network anomalies of all kinds. With the proposal of the national information security strategy, theoretical and applied research on anomaly detection technology has important theoretical and practical value.

This thesis designs and implements an abnormal traffic detection system based on machine learning. The system has three main parts: a data acquisition module, a data preprocessing module, and an anomaly detection module. It is built with extensible component development technology and can identify and classify abnormal traffic. As the input source of system data, the data acquisition module collects data from the real network environment. The data preprocessing module extracts domain name features and packet length features. Domain name features are extracted using information entropy theory and the N-gram algorithm. Packet length features are extracted using the term frequency-inverse document frequency (TF-IDF) transformation, which converts the packet length feature from a single frequency distribution into a normalized frequency distribution. The anomaly detection module currently contains two typical sub-modules, DGA (domain generation algorithm) domain name detection and APT (advanced persistent threat) behavior detection; other sub-modules can be added later according to actual demand.

In the DGA domain name detection sub-module, a comparative experiment between random forest and SVM (support vector machine) shows that the random forest algorithm has the advantages of a higher detection rate and a lower false positive rate. In the APT behavior detection sub-module, a comparative experiment among naive Bayes, SVM, and random forest shows that the naive Bayes classifier achieves better prediction results when facing unknown attacks. Functional testing shows that the system successfully performs malicious domain name annotation, generation of DGA domain name anomaly information, and model retraining. The training and test results also show that both the DGA domain name detection sub-model and the APT behavior detection sub-model achieve good detection results. Finally, because the system adopts extensible component development technology, further anomaly detection modules can be added conveniently, which lays the foundation for further development and improvement of network abnormal traffic detection performance.
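The feature extraction steps named in the abstract (information entropy and N-grams for domain names, TF-IDF for packet lengths) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `shannon_entropy`, `char_ngrams`, and `tfidf` are hypothetical, the choice of character bigrams is an assumption, and the TF-IDF variant shown (relative term frequency multiplied by log(N / df)) is one common textbook form that may differ from the one the author used.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy (bits per character) of a string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_ngrams(text, n=2):
    """Counter of overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def tfidf(flows):
    """TF-IDF weights per flow, treating each packet length as a 'term'.

    flows: list of lists of packet lengths (one inner list per flow).
    Assumed variant: tf = count / flow length, idf = log(N / df).
    """
    n_flows = len(flows)
    # Document frequency: in how many flows each packet length appears.
    df = Counter()
    for flow in flows:
        df.update(set(flow))
    weights = []
    for flow in flows:
        counts = Counter(flow)
        weights.append({
            length: (count / len(flow)) * math.log(n_flows / df[length])
            for length, count in counts.items()
        })
    return weights
```

Under these assumptions, a domain label that looks randomly generated tends to have higher character entropy and rarer bigrams than a dictionary-like label, and a packet length that appears in every flow receives a TF-IDF weight of zero. Vectors built from such features could then be passed to a classifier such as random forest or SVM.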
Keywords/Search Tags:DGA domain name detection, APT behavior detection, SVM algorithm, Random forest, Bayes algorithm