Malicious Domain Detection Based On DNS Offline Response Traffic

Posted on: 2018-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: L Wei
GTID: 2348330518499439
Subject: Engineering
Abstract/Summary:
The Domain Name System (DNS) is one of the most important pieces of Internet infrastructure. Thousands of Internet applications rely on the normal resolution services provided by DNS in order to run effectively. Previous research has shown that malicious domain names used by malware such as Trojan horses, botnets, and ransomware can be effectively detected and identified from DNS protocol traffic data, without considering the specific communication and control protocol in use. Detected malicious domain names can serve as clues for tracking the communication behavior of malware, and can also provide useful input for further defense measures. This thesis constructs a malicious domain name detection model. First, working from offline DNS response traffic, it uses parsing programs to extract resource records from the offline packets and then builds a database with a visual query interface. Second, it constructs the sample data set needed for the experiments by collecting blacklists and whitelists of domain names and IP addresses published by organizations such as Internet security vendors and security research groups. Relying on the database and the sample set, the thesis first compares, from the perspective of probabilistic models, how well support vector machines (SVM) and random forests detect malicious domain names. Based on a comprehensive performance evaluation of both sets of experimental results, the better-performing algorithm is adopted as a detection method in the final model. Then, from the perspective of probabilistic graphical models, the thesis applies the belief propagation algorithm to a probabilistic graph constructed from the data set and detects malicious domain names according to a belief threshold. Finally, the thesis combines the two methods and runs malicious domain name detection on offline DNS data from a real environment to verify the model's practical engineering performance.

Compared with previous research, this thesis incorporates network security domain knowledge into the feature engineering of domain names when applying machine learning algorithms, and encodes and quantizes the features appropriately. In contrast to the samples used in other studies, the samples collected in this thesis lack time-related features. Nevertheless, the malicious domain name detection experiments still achieved good results: the average accuracy is 91%, recall is no lower than 90%, and the area under the ROC curve (AUC) reaches 0.97, which indicates that the detection generalizes well. Meanwhile, on real offline traffic, the model applies the two detection methods iteratively, which raises the detection rate of malicious domain names from 27% with a single detection method to 37%. These results show that the proposed model can be used to detect malicious domain names in large-scale offline data environments and offers a feasible way to find clues to malware communication activities.
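
The belief propagation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the graph is built, which potentials are used, or what the belief threshold is, so everything here is an assumption made for illustration: an undirected graph linking domain names to the IP addresses they resolve to, binary states (benign or malicious), priors taken from blacklists and whitelists, a homophily edge potential controlled by a hypothetical `eps` parameter, and a hypothetical threshold of 0.51. The function name `belief_propagation` and the sample domains are likewise invented.

```python
from collections import defaultdict


def belief_propagation(edges, priors, eps=0.1, iterations=10, default_prior=0.5):
    """Loopy belief propagation with binary states (0 = benign, 1 = malicious).

    edges:  iterable of (node_a, node_b) pairs, e.g. (domain, resolved IP).
    priors: dict node -> prior P(malicious); other nodes get default_prior.
    Returns a dict node -> estimated P(malicious).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def phi(node):
        # Node potential: (P(benign), P(malicious)) from the prior.
        p = priors.get(node, default_prior)
        return (1.0 - p, p)

    # Assumed homophily edge potential: linked nodes tend to share a label.
    psi = ((0.5 + eps, 0.5 - eps), (0.5 - eps, 0.5 + eps))

    # messages[(i, j)] is the message from i to j over the states of j.
    messages = {(i, j): (0.5, 0.5) for i in neighbors for j in neighbors[i]}

    for _ in range(iterations):
        new_messages = {}
        for (i, j) in messages:
            prior = phi(i)
            out = [0.0, 0.0]
            for xi in (0, 1):
                # Combine i's prior with messages from all neighbors except j.
                incoming = prior[xi]
                for k in neighbors[i]:
                    if k != j:
                        incoming *= messages[(k, i)][xi]
                for xj in (0, 1):
                    out[xj] += incoming * psi[xi][xj]
            total = out[0] + out[1]
            new_messages[(i, j)] = (out[0] / total, out[1] / total)
        messages = new_messages  # synchronous update

    beliefs = {}
    for i in neighbors:
        b = list(phi(i))
        for k in neighbors[i]:
            b[0] *= messages[(k, i)][0]
            b[1] *= messages[(k, i)][1]
        beliefs[i] = b[1] / (b[0] + b[1])
    return beliefs


# Hypothetical toy data: two labeled domains, two unlabeled ones.
sample_edges = [
    ("bad.example", "198.51.100.7"),
    ("unknown.example", "198.51.100.7"),
    ("good.example", "203.0.113.5"),
    ("other.example", "203.0.113.5"),
]
sample_priors = {"bad.example": 0.99, "good.example": 0.01}

sample_beliefs = belief_propagation(sample_edges, sample_priors)
BELIEF_THRESHOLD = 0.51  # assumed value, not taken from the thesis
flagged = sorted(
    node for node, belief in sample_beliefs.items()
    if belief >= BELIEF_THRESHOLD and node not in sample_priors
)
print(flagged)
```

In this toy graph, the unlabeled domain that shares an IP address with the blacklisted domain ends up with a belief above 0.5, and the one that shares an IP address with the whitelisted domain ends up below 0.5. A real deployment would need potentials and a threshold tuned on labeled data.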
Keywords/Search Tags: DNS, offline traffic, malicious domain detection, machine learning, support vector machines, random forest, belief propagation