| In recent years, with the rapid development of the Internet, the number of cyber attacks has grown steadily. These attacks take many forms, such as Distributed Denial of Service (DDoS) attacks and phishing attacks. To carry them out, attackers often rely on domain names produced by a Domain Generation Algorithm (DGA). DGA is a technique that uses algorithms to generate random domain names, which are frequently used to hide the IP addresses of the Command and Control (C&C) servers of malicious software. Attackers use DGA domain names to communicate with control servers and build botnets for cyber attacks. Because DGA domain names are generated randomly, the attacks become more complex and harder to detect and prevent. Timely detection of DGA domain names can therefore help prevent cyber attacks effectively. Malicious domain detection technology has evolved from traditional manual feature extraction to methods based on deep learning. Existing deep learning models, however, cannot detect malicious domain names comprehensively, and the datasets used for DGA domain name detection contain few categories and unbalanced samples. To address these two problems, the main work of this paper is as follows: (1) To address the incomplete detection of existing DGA domain name categories, we design the DGA domain name detection algorithm APCNN-BiLSTM-ATT, based on an analysis of domain names in the public benign domain name dataset Alexa and the DGArchive dataset, which contains multiple categories of DGA domain names. First, we preprocess the domain name text using character padding and character embedding. Then, we use a parallel Convolutional Neural Network with an attention mechanism (APCNN) and a Bidirectional Long Short-Term Memory network with an attention mechanism (BiLSTM-ATT) to extract deep features separately, and a fully connected layer fuses the features and outputs the classification results. Finally, the proposed model is used to conduct binary and multi-class domain name classification experiments. The experimental results show that, compared with other deep learning algorithms such as CNN, LSTM-ATT, and the Bilbo-Hybrid model, the proposed APCNN-BiLSTM-ATT domain detection algorithm achieves the highest accuracy and F1 score. (2) To address the sample imbalance in existing malicious domain name datasets, this paper uses an improved focal loss function as the loss function of the APCNN-BiLSTM-ATT algorithm, so that training focuses more on classes with few samples. The experimental results show that, in the multi-class experiments, compared with other deep learning models and with the APCNN-BiLSTM-ATT algorithm using the cross-entropy loss function, the proposed model trains faster and achieves some improvement in precision, recall, and F1 score for domain name categories with fewer samples and for some difficult-to-detect ones. |
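
The character padding step mentioned in the abstract can be illustrated with a minimal sketch. The vocabulary, the maximum length of 64, and the names `VOCAB`, `MAX_LEN`, and `encode_domain` are assumptions made for illustration only; the abstract does not state the paper's actual settings. Each resulting integer would then typically index a row of a trainable embedding matrix (the character embedding).

```python
import string

# Hypothetical vocabulary: lowercase letters, digits, and a few symbols.
# Index 0 is reserved for padding; unknown characters share one extra index.
VOCAB = string.ascii_lowercase + string.digits + "-._"
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(VOCAB)}
PAD_ID = 0
UNK_ID = len(VOCAB) + 1
MAX_LEN = 64  # assumed fixed input length, not taken from the paper


def encode_domain(domain, max_len=MAX_LEN):
    """Map a domain name to a fixed-length list of character indices.

    The name is lowercased, truncated to max_len characters, and
    right-padded with PAD_ID so every sample has the same length.
    """
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in domain.lower()[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))
```

Right-padding to a fixed length lets domain names of different lengths be batched together before they reach the convolutional and recurrent branches.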
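
To illustrate why a focal-style loss shifts attention toward rare or hard samples, the sketch below implements the standard multi-class focal loss, FL = -alpha_t (1 - p_t)^gamma log(p_t), for a single sample. This is the commonly used baseline form, not the paper's improved variant, whose details the abstract does not give; the function names and the default `gamma=2.0` are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy loss for one sample."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard focal loss for one sample (not the paper's improved variant).

    gamma down-weights well-classified samples; alpha is an optional
    list of per-class weights (assumed here, e.g. larger for rare classes).
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` and no `alpha`, the function reduces to cross-entropy. For larger `gamma`, a confidently correct sample contributes far less loss than a misclassified one, so gradients are dominated by the hard and under-represented categories.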