| As the Internet grows rapidly, the network security situation is becoming increasingly critical. DNS is a weak point in the Internet infrastructure: attackers can expose ISPs and users to high risk at low cost. Many schemes have been proposed to strengthen DNS security, e.g., DNS over HTTPS (DoH). However, some of them have inherent design flaws, and others are so new that they will take a long time to deploy. DNS is therefore still used worldwide for domain name resolution, which makes it especially important to detect anomalous traffic in DNS requests in a timely and accurate manner. In this thesis, we focus on DNS anomalous traffic detection. We cluster DNS traffic with an unsupervised model to provide accurate user class information. Using a time series analysis model, we predict user traffic features to provide accurate reference values for online DNS detection. Based on the user information and feature reference values, we study an online detection method for anomalous DNS traffic. The main work of this thesis is as follows: (1) We evaluate and optimize the clustering model to address the low precision of unsupervised models. By analyzing the nature of the dataset, we select traffic features from each layer of the TCP/IP model that help distinguish user classes, and then process the raw traffic data. The clustering performance of X-means, the Gaussian mixture model, and DBSCAN is evaluated using internal evaluation metrics, and a feature selection algorithm is used to provide optimal feature inputs for the clustering model. Because the clustering results do not directly reflect the user type, we propose an algorithm that maps cluster labels to user labels (CL2UL). By combining the Gaussian mixture model with the CL2UL algorithm, we achieve 95% detection accuracy and 96% recall. (2) In view of the fluctuation of user traffic data, this thesis proposes an automatic parameter optimization method for the SARIMA model, so that the optimized model fits and predicts user features more accurately and provides more accurate reference values for online detection of anomalous DNS traffic. Compared with manual tuning of model parameters, the proposed method improves the fitting and prediction of user feature values. (3) For online detection of anomalous DNS traffic, this thesis proposes a threshold-based algorithm for online detection of anomalous DNS traffic (ODADNS), which builds on the user cluster information obtained from the clustering model and the user feature values predicted by the time series analysis model. The algorithm is compared with decision trees, naive Bayes, support vector machines, and other existing work through simulation experiments. The results show that ODADNS is comparable to the decision tree and naive Bayes algorithms in detecting known anomalies, and that its online detection takes less time than the other existing work. For unknown anomalies, ODADNS significantly outperforms the other existing work and shows stronger generalization ability. |
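The threshold idea in item (3) can be illustrated with a minimal sketch. The abstract does not specify the actual ODADNS rule, so everything here is an assumption made for illustration: the `Reference` structure, the relative-deviation test, the `tolerance` parameter, and the cluster names are all hypothetical. The sketch only shows how a per-cluster reference value (as a forecasting model might supply) could be compared against an observed feature value.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    """Hypothetical per-cluster reference for one traffic feature."""
    predicted: float   # value forecast by a time series model (assumed given)
    tolerance: float   # allowed relative deviation, e.g. 0.5 means 50%


def is_anomalous(observed: float, ref: Reference) -> bool:
    """Flag a value whose relative deviation from the reference exceeds the tolerance."""
    if ref.predicted == 0:
        # Relative deviation is undefined; treat any non-zero value as anomalous.
        return observed != 0
    deviation = abs(observed - ref.predicted) / abs(ref.predicted)
    return deviation > ref.tolerance


def detect(observations, references):
    """Return the (user, cluster, observed) tuples flagged as anomalous.

    observations: iterable of (user, cluster, observed_value)
    references:   dict mapping cluster name -> Reference
    """
    flagged = []
    for user, cluster, observed in observations:
        ref = references.get(cluster)
        if ref is None:
            continue  # no reference for this cluster; skip rather than guess
        if is_anomalous(observed, ref):
            flagged.append((user, cluster, observed))
    return flagged


# Illustrative values only; not taken from the thesis.
references = {
    "home": Reference(predicted=100.0, tolerance=0.5),
    "enterprise": Reference(predicted=1000.0, tolerance=0.2),
}
observations = [
    ("u1", "home", 120.0),
    ("u2", "home", 400.0),
    ("u3", "enterprise", 1100.0),
    ("u4", "enterprise", 1500.0),
    ("u5", "unknown", 5.0),
]
flagged = detect(observations, references)
```

Because each check is a single comparison against a precomputed reference, a rule of this shape costs constant time per observation, which is consistent with the low online detection time the abstract reports; the thesis itself may use a different criterion.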