Research On Malicious Domain Name Detection Method Based On Deep Learning

Posted on: 2021-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Yang
GTID: 2518306047986709
Subject: Cyberspace security
Abstract/Summary:
In response to IP address blocking and DNS sinkholes, modern botnets have begun to use domain generation algorithms (DGA) to dynamically generate malicious domain names as rendezvous points for communication between infected hosts and the controller. Detecting malicious domain names makes it possible to find infected hosts in a timely manner, which protects users' online security and also helps with botnet tracking and tracing. Existing malicious domain name detection methods still have two problems. First, malicious domain names generated by wordlist-based DGAs cannot be detected effectively. Second, the detection model may be deceived by adversarial samples, so its robustness needs further analysis. To address these problems, this thesis carries out the following research work:

(1) Domain name preprocessing based on n-gram and word2vec. By comparing n-gram distributions, the difference between malicious and benign domain names is analyzed. It is found that the difference between the character combinations of the two classes becomes more pronounced as the value of n increases, which helps the detection model distinguish them. The domain name is therefore segmented based on n-grams. However, when n is too large, it not only increases the amount of computation but also leads to the curse of dimensionality. To solve this problem, the range of n is determined according to Zipf's law. A skip-gram model is then trained on the segmented domain name corpus with the word2vec tool, so as to convert each domain name into a numerical vector that can be processed by the computer.

(2) Malicious domain name detection model based on an attention convolutional network. Because the sliding window of a convolutional neural network cannot capture long-distance dependencies, an attention mechanism is introduced. A context vector is constructed to represent the relationship between the current word and the other words in the sequence. The input vector and the context vector are then fed together into the convolution module. To fully extract the latent features of the domain name, the designed convolution module contains three parallel convolutional layers with different kernel sizes, and each convolutional layer is followed by a max pooling layer. The experimental results show that the constructed detection model achieves recalls of 86.03%, 88.31%, and 91.57% for malicious domain names generated by three wordlist-based DGAs (gozi, matsnu, and suppobox, respectively). Compared with other models, the overall detection performance also improves to varying degrees.

(3) Robustness analysis of the detection model. To evaluate model robustness, an adversarial attack method based on generative adversarial networks is proposed. This method first learns a latent space that is independent and identically distributed with the sample set through the game between the generator and the discriminator. The inverter is then used to map the original sample to the latent space, where perturbations are added. Finally, the generator generates an adversarial sample, and the discriminator guarantees the similarity between the adversarial sample and the original sample. Experimental results show that the four deep learning-based malicious domain name detection models are all vulnerable to the generated adversarial samples, with an average recall of only 51.16%. The adversarial samples are then used to augment the training set. Adversarial training enhances model robustness and raises the recall on the adversarial samples to more than 85%. It is also found that the recall for malicious domain names generated by wordlist-based DGAs increases slightly.
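The n-gram segmentation step in part (1) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`ngrams`, `segment`, `ngram_distribution`), the choice to segment only the leftmost label of the domain, and the default `n_values` are all assumptions made for illustration. The thesis selects the range of n using Zipf's law, and the abstract does not describe that procedure.

```python
from collections import Counter


def ngrams(label: str, n: int) -> list[str]:
    """Return the overlapping character n-grams of a string."""
    return [label[i:i + n] for i in range(len(label) - n + 1)]


def segment(domain: str, n_values=(1, 2, 3)) -> list[str]:
    """Segment a domain name into character n-gram tokens.

    Assumptions (not from the thesis): only the leftmost label is used,
    and n_values is a placeholder range rather than the Zipf-derived one.
    """
    label = domain.lower().split(".")[0]
    tokens: list[str] = []
    for n in n_values:
        tokens.extend(ngrams(label, n))
    return tokens


def ngram_distribution(domains, n: int) -> dict[str, float]:
    """Relative frequency of each n-gram over a collection of domains."""
    counts: Counter = Counter()
    for domain in domains:
        counts.update(ngrams(domain.lower().split(".")[0], n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}
```

Token lists produced by a function like `segment` would form the corpus on which a skip-gram model is trained. Calling `ngram_distribution` on a benign set and a malicious set with the same n gives the kind of distribution comparison the abstract describes.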
Keywords/Search Tags:domain generation algorithms, malicious domain name detection, attention mechanism, adversarial examples