Research And Implementation Of A DGA Malicious Domain Detection Method

Posted on: 2022-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: L L Chen
GTID: 2518306494973069
Subject: Mathematics
Abstract/Summary:
The Domain Name System (DNS) has become an essential piece of network infrastructure and an information hub, and it is an indispensable part of everyday network activity. Malicious domain names, however, appear more and more frequently. They harm the DNS itself and cause losses to the country, to society, and to individuals, so detecting them is increasingly important.

Most current detection methods rely on either blacklist matching or machine learning. Blacklist matching depends on a list of malicious domain names that have already been detected, so it is slow to update and lags behind new threats. Machine-learning methods are more popular and generally train classifiers on character features of the domain name. As domain generation algorithms (DGAs) evolve, however, more and more malicious domain names differ only slightly from normal ones: by changing a few characters in real words to imitate normal domain names, they make character-based features unreliable for telling the two apart. This thesis proposes a detection method that uses improved character features as the basis for classification. It also groups malicious domain names into specific categories, with their proportions, according to the malicious lure content of the websites behind them. The specific work is as follows:

(1) The thesis first analyzes how malicious DGA domain names and normal domain names differ in character composition and distribution. Based on this analysis, nine basic features of the domain name are selected for model training and experiments, which provides the baseline for the subsequent classification based on improved character features. The thesis then analyzes the improved character features, that is, the lexical features, trains a classification model with the support vector machine (SVM) algorithm, and tests it to obtain detection results. Compared with classification on the original character features, accuracy improves by 0.7% and precision by 0.6%.

(2) Building on this classification and detection work, the thesis studies the classification of short domain names and proposes adding HMM features to the SVM model for training. Validation of the trained model shows that the method performs well on both long and short domain names. The final results reach an accuracy above 95.4%, a recall above 96.4%, and a precision above 94.4%. Compared with the original character features, accuracy and precision both improve by more than 1%.

(3) The thesis further categorizes malicious domain names. Web page request filtering, matching of page-title keywords against a dictionary database, sub-link information, and similar signals are used to cluster unlabeled domain names. This divides the malicious domain name data set used in the thesis into 15 categories and gives the proportion of each category, and the approach can handle the classification of massive domain name data.

In summary, replacing the original character features with word segmentation features significantly improves the detection of new DGA domain names, and adding the HMM coefficient makes it possible to detect short DGA domain names that are otherwise difficult to detect. The cluster analysis of malicious domain name websites sorts a wide range of malicious domain names into specific categories and enables the classification of massive data covering many types of malicious domain names.
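
To make the idea of character features concrete, here is a minimal Python sketch of the kind of per-domain statistics such a classifier might consume. The abstract does not list the nine features the thesis uses, so the five features below, the function name `char_features`, and the choice to score only the leftmost label of a registered domain (for example `google` in `google.com`) are all illustrative assumptions, not the author's feature set.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def char_features(domain: str) -> dict:
    """Compute illustrative character features for a domain's leftmost label.

    Hypothetical feature set: the thesis selects nine features that the
    abstract does not enumerate, so these are stand-ins.
    """
    label = domain.lower().split(".")[0]
    n = len(label)
    if n == 0:
        raise ValueError("empty domain label")

    # Shannon entropy of the character distribution, in bits.
    counts = Counter(label)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    digits = sum(ch.isdigit() for ch in label)
    vowels = sum(ch in VOWELS for ch in label)

    # Longest run of consecutive consonant letters; digits, hyphens,
    # and vowels all break a run.
    longest_run = run = 0
    for ch in label:
        if ch.isalpha() and ch not in VOWELS:
            run += 1
            longest_run = max(longest_run, run)
        else:
            run = 0

    return {
        "length": n,
        "digit_ratio": digits / n,
        "vowel_ratio": vowels / n,
        "entropy": entropy,
        "max_consonant_run": longest_run,
    }


if __name__ == "__main__":
    for name in ("google.com", "xq7kz2vbn1.net"):
        print(name, char_features(name))
```

Feature vectors of this kind could then be passed to an SVM implementation such as scikit-learn's `SVC`. The lexical (word segmentation) and HMM features described in the abstract would be additional columns alongside them and are not reproduced here.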
Keywords/Search Tags: malicious domain name detection, DGA, character feature, HMM