Research On Malicious URL Detection Based On Deep Learning Algorithm

Posted on: 2021-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: H H Wang
GTID: 2518306128976619
Subject: Master of Engineering
Abstract/Summary:
In recent years, malicious URLs have grown in number and variety, often differing only slightly and being hard to distinguish, which makes malicious URL detection more difficult. Most existing research on malicious URL detection relies on traditional, single machine learning algorithms. These algorithms are computationally simple and cannot automatically learn additional URL features; their generalization ability is limited and their detection accuracy is unsatisfactory. In this thesis, malicious URLs from the PhishTank dataset and benign URLs collected by a web crawler are combined into a single dataset as the basis for the research. Several techniques are used to extract multiple kinds of features, and deep learning models are studied in order to improve malicious URL detection. The data obtained from a range of comparative experiments show the effectiveness of the proposed methods. The research contents of this thesis are as follows:

1. A malicious URL detection method based on the Bi-IndRNN (Bidirectional Independent Recurrent Neural Network) algorithm is proposed. Working from the URL string itself, host information features and URL information features are extracted, and the two types are fused. The Bi-IndRNN algorithm is then used for detection, with softmax for classification. Comparative experiments showed that the Bi-IndRNN algorithm significantly improved malicious URL detection results.

2. A malicious URL detection method based on the Bi-LSTM (Bidirectional Long Short-Term Memory) algorithm is proposed. Image processing technology is used to extract texture fingerprint features of malicious URLs, and static lexical features of the URL are extracted as well. The Bi-LSTM algorithm is then used for detection, with softmax for classification. Comparative experiments showed that the Bi-LSTM algorithm significantly improved malicious URL detection results.

3. A bidirectional LSTM algorithm combined with a convolutional neural network and an independent recurrent neural network (CNN-BiLSTM-IndRNN) is proposed. Image processing technology is used to extract texture fingerprint features of malicious URLs. Natural language processing technology is also applied: the word vector tool word2vec is trained to obtain URL word vector features, and static lexical features of the URL are extracted. The CNN-BiLSTM-IndRNN algorithm is then used for detection, with softmax for classification. Comparative experiments showed that the CNN-BiLSTM-IndRNN algorithm significantly improved malicious URL detection results.

4. A parallel joint algorithm model (CATTB) is proposed that combines a convolutional neural network, a bidirectional independent recurrent neural network, and an attention mechanism. Using the rules of URL structure, the URL is binned to extract relocation features. Image processing technology is used to obtain texture fingerprint features of malicious URLs. Natural language processing technology is also applied: the word vector tool word2vec is trained to obtain URL word vector features, and static lexical features of the URL are extracted. The CNN (Convolutional Neural Network) extracts deep local features, the attention mechanism then adjusts the weights, the bidirectional IndRNN extracts global features, and softmax performs the final classification. Comparative experiments showed that the CATTB parallel joint algorithm significantly improved malicious URL detection results.
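To make the first contribution more concrete, the following is a minimal sketch of a bidirectional IndRNN forward pass followed by a softmax classifier, written in plain Python. It is not the thesis's implementation. It assumes the usual IndRNN formulation, in which each hidden neuron has its own scalar recurrent weight, and it uses untrained random weights. The three-flag character encoding, the layer sizes, and every function name (`encode_char`, `indrnn_pass`, `init_params`, `classify_url`) are hypothetical choices made only for illustration.

```python
import math
import random


def encode_char(ch):
    """Hypothetical 3-dim character encoding: letter, digit, URL punctuation."""
    return [
        1.0 if ch.isalpha() else 0.0,
        1.0 if ch.isdigit() else 0.0,
        1.0 if ch in "./-_?=&%@:" else 0.0,
    ]


def indrnn_pass(inputs, w_in, u_rec, bias):
    """Run one IndRNN direction and return the final hidden state.

    Each neuron n keeps its own scalar recurrent weight u_rec[n], so
    h_t[n] = relu(w_in[n] . x_t + u_rec[n] * h_{t-1}[n] + bias[n]).
    """
    hidden = [0.0] * len(u_rec)
    for x in inputs:
        hidden = [
            max(0.0, sum(w * xi for w, xi in zip(w_in[n], x))
                + u_rec[n] * hidden[n] + bias[n])
            for n in range(len(u_rec))
        ]
    return hidden


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(input_size=3, hidden_size=8, num_classes=2, seed=0):
    """Random, untrained parameters (assumed sizes, for illustration only)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    def direction():
        return {
            "w_in": matrix(hidden_size, input_size),
            "u_rec": [rng.uniform(0.0, 1.0) for _ in range(hidden_size)],
            "bias": [0.0] * hidden_size,
        }

    return {
        "fwd": direction(),
        "bwd": direction(),
        "w_out": matrix(num_classes, 2 * hidden_size),
        "b_out": [0.0] * num_classes,
    }


def classify_url(url, params):
    """Return class probabilities from a bidirectional IndRNN plus softmax."""
    xs = [encode_char(ch) for ch in url]
    fwd = indrnn_pass(xs, **params["fwd"])        # left-to-right pass
    bwd = indrnn_pass(xs[::-1], **params["bwd"])  # right-to-left pass
    features = fwd + bwd                          # concatenate both directions
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(params["w_out"], params["b_out"])
    ]
    return softmax(logits)


probs = classify_url("http://example.com/login?id=1", init_params())
```

Because the weights are random, `probs` carries no meaning beyond being a valid two-class probability distribution. A real model would be trained on labelled URLs and would take the fused host and URL features described in the abstract as input, not this toy encoding.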
Keywords/Search Tags: Deep Learning, Malicious URL, Relocation, Texture Fingerprint, Word Vector