Research On Detection Technology Of Malicious Domain Name Based On Neural Network

Posted on: 2022-12-22  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Lin  Full Text: PDF
GTID: 2518306779996419  Subject: Automation Technology
Abstract/Summary:
In recent years, with the continuous innovation and development of big data, artificial intelligence, cloud computing, and other technologies, people's lives have become more convenient and intelligent. However, alongside the convenience brought by technological innovation, network attacks have become increasingly frequent and are rising rapidly. Among them, hacker groups that use botnets as their main attack method are especially destructive and influential. To evade blacklist detection, today's botnets usually use domain generation algorithms (DGA) to generate a large number of domain names in a short period of time and then use these domain names for connection and communication. In this way they control infected hosts at scale, turning them into "meat machines" (compromised hosts) in the attacker's hands that launch large-scale attacks on the target's network. Detecting malicious domain names generated by DGAs is therefore of great significance for defending against botnets.

With the widespread use of deep learning in various fields, malicious domain name detection has shifted from traditional feature extraction combined with machine learning to deep learning techniques that detect and classify malicious domain names. However, because the volume of malicious domain name family data is huge, the number of samples is imbalanced across families. In addition, in recent years some families have begun generating domain names from word dictionaries, and detecting this type of malicious domain name is also a major challenge. In response to these two problems, the specific research work carried out in this thesis is as follows:

(1) To address the imbalance in domain name family data, this thesis combines the strengths of CNN and RNN in feature extraction so that more effective feature information can be extracted in a targeted manner, making the overall detection method more effective. At the same time, the Focal loss function is introduced to address the data imbalance among malicious domain name families. Comparative binary and multi-class classification experiments on public datasets verify the effectiveness and feasibility of the proposed detection method.

(2) To address the poor classification performance on dictionary-based malicious domain name families, this thesis uses an N-gram model to extract 2-gram feature vectors from domain name data and combines them with character vectors to form mixed vectors. For neural network feature extraction, the convolutional model is built on the classic TextCNN model and improved by borrowing the classic Inception structure proposed by Google. Under multi-scale convolution, the convolutional layers are deepened, and batch normalization (BN) is introduced to reduce the model's parameters and the amount of computation. A self-attention layer then assigns different weights to different information in the domain name data and removes noise. Finally, the fully connected layer is replaced with a global average pooling layer to improve the model's generalization ability. The model is then used to verify the effectiveness of the method in binary and multi-class classification.

The innovations of this thesis include:

(1) A malicious domain name detection method based on CNN-BiGRU is proposed. This method makes full use of the different capabilities of CNN and RNN in feature extraction. While ensuring the best binary classification performance, it also introduces the Focal loss function to address data imbalance among malicious domain name families. Compared with the single CNN and RNN models, detection accuracy is 0.12% and 0.19% higher, respectively, in the binary classification experiment, and 2.89% and 1.43% higher in the multi-class experiment.

(2) A malicious domain name detection method based on Attention-CNN is proposed. The method uses the N-gram model to extract mixed vectors from domain name data, then improves the convolutional network layers and combines them with a self-attention mechanism to address the poor classification performance on dictionary-based malicious domain name families. In the binary classification experiment, the detection accuracy is 99.07%, the best result among the compared models. In multi-class classification, it also classifies newly emerged malicious domain name families well, performing especially well on dictionary-based domain name families.
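
As a rough illustration of two ingredients named in the abstract, the sketch below extracts character 2-grams from a domain name and evaluates the standard focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), for a single sample. It is not the thesis's implementation: the function names, the example domain, and the values gamma = 2.0 and alpha = 1.0 are assumptions, since the abstract does not state how the 2-grams are built or which focal loss parameters were used.

```python
import math


def char_ngrams(domain: str, n: int = 2) -> list[str]:
    """Return the overlapping character n-grams of a domain string.

    Hypothetical helper: the abstract only says 2-gram features are
    extracted; lowercasing and the sliding window are assumptions.
    """
    label = domain.lower()
    return [label[i:i + n] for i in range(len(label) - n + 1)]


def focal_loss(probs: list[float], target: int,
               gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (assumed to sum to 1)
    target : index of the true class
    gamma  : focusing parameter (assumed value, not from the thesis)
    alpha  : class weighting factor (assumed value, not from the thesis)
    """
    # Clamp p_t away from 0 so that log() is always defined.
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(char_ngrams("google"))       # ['go', 'oo', 'og', 'gl', 'le']
    easy = focal_loss([0.1, 0.9], 1)   # confident, correct prediction
    hard = focal_loss([0.9, 0.1], 1)   # confident, wrong prediction
    print(easy, hard)
```

With gamma = 0 the expression reduces to ordinary cross-entropy. Larger gamma shrinks the loss on confidently classified samples, so hard samples, which often come from under-represented families, contribute relatively more to training.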
Keywords/Search Tags: malicious domain names, domain generation algorithm, deep learning, botnets, attention mechanism