Malicious URL Detection And Research Based On Deep Learning

Posted on: 2022-07-20
Degree: Master
Type: Thesis
Country: China
Candidate: X X Yang
Full Text: PDF
GTID: 2518306557968199
Subject: Computer technology
Abstract/Summary:
With the worldwide adoption of information technology, the Internet provides people with efficient and convenient services, but it also brings many security risks. Massive amounts of personal information and corporate data are exposed on the Internet, which creates opportunities for cybercrime. URLs, as the entry point to Internet applications, are frequent targets of network attacks and face many security risks. Screening out malicious URLs is therefore of great significance for maintaining network security. After an extensive review of existing malicious URL detection methods, this thesis applies deep learning to malicious URL detection. The specific research work is as follows:

1. To address the particular syntax of URLs, a set of text preprocessing and feature representation methods is designed. Using the words and characters of a URL together with their context information, the method represents each URL string uniformly as a feature matrix, giving a vector representation of the URL string.

2. A malicious URL detection model based on automatic feature extraction and deep neural networks is proposed. URL word vectors are combined with neural networks to mine the fine-grained and contextual information in the URL string and to extract hidden abstract features. An attention mechanism is added to strengthen the model's processing of key information and to suppress irrelevant information, and the result is a CBLA (CNN-BiLSTM-Attention) model for detecting malicious URLs. Experimental results show that the proposed model handles complex abstract problems effectively and performs better in both detection accuracy and detection efficiency.

3. To address the difficulty of obtaining dynamic representations with URL word vectors, a BERT model trained on a large-scale corpus is used as a feature extractor to obtain dynamic URL word vectors. To improve the generalization ability of the word vectors generated by BERT, the model is further pre-trained on an existing URL dataset. The word vectors generated by BERT are then fed into a BiLSTM network to strengthen the model's ability to learn long-term dependencies between words, and the result is a BBL (BERT-BiLSTM) detection model. Experimental results show that the proposed model improves accuracy and F1 score.
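
The preprocessing step in item 1 can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the tokenization rules, vocabulary, or sequence lengths, so the names `MAX_CHARS`, `MAX_WORDS`, `CHAR_VOCAB`, `split_words`, `build_word_vocab`, and `encode` and all their values are assumptions made for illustration. The sketch turns a URL into fixed-length character-level and word-level index sequences, which an embedding layer could then map to a feature matrix.

```python
import re

# Hypothetical settings; the abstract does not state the values used in the thesis.
MAX_CHARS = 64   # fixed length of the character-level sequence
MAX_WORDS = 16   # fixed length of the word-level sequence
PAD, UNK = 0, 1  # reserved ids for padding and unknown tokens

# Assumed character vocabulary: lowercase letters, digits, common URL punctuation.
CHAR_VOCAB = {c: i + 2 for i, c in enumerate(
    "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%")}


def split_words(url):
    """Split a URL into lowercase word tokens at every non-alphanumeric character."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def build_word_vocab(urls):
    """Assign an integer id (starting at 2) to each distinct word in the corpus."""
    vocab = {}
    for url in urls:
        for word in split_words(url):
            vocab.setdefault(word, len(vocab) + 2)
    return vocab


def pad(ids, length):
    """Right-pad with PAD and truncate so the result has exactly `length` ids."""
    return (ids + [PAD] * length)[:length]


def encode(url, word_vocab):
    """Return (char_ids, word_ids), two fixed-length index sequences for one URL."""
    char_ids = pad([CHAR_VOCAB.get(c, UNK) for c in url.lower()], MAX_CHARS)
    word_ids = pad([word_vocab.get(w, UNK) for w in split_words(url)], MAX_WORDS)
    return char_ids, word_ids


# Usage with made-up example URLs.
corpus = ["http://example.com/login.php?id=1", "http://example.org/index.html"]
word_vocab = build_word_vocab(corpus)
char_ids, word_ids = encode(corpus[0], word_vocab)
```

Each id sequence has a fixed length, so stacking the embedding vectors looked up for those ids gives one matrix per URL. In a model such as the CBLA or BBL described above, that lookup would be the first layer; it is omitted here to keep the sketch free of external dependencies.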
Keywords/Search Tags: Deep learning, Malicious URL detection, Attention, BERT