Research on Multi-feature and Multi-model Malicious URL Recognition and Detection

Posted on: 2021-05-26  Degree: Master  Type: Thesis
Country: China  Candidate: Y F Peng
GTID: 2518306128476644  Subject: Master of Engineering
Abstract/Summary:
Internet technology is developing rapidly, but the accompanying network security problems are becoming increasingly serious, and identifying and detecting malicious URLs efficiently is an urgent issue. Malicious URLs are time-sensitive, short, widespread, and quickly updated. Making full use of URL characteristics can clearly improve the efficiency of identification and classification, and can to some extent mitigate network security problems such as data theft and malware propagation. Based on the applicability of the algorithms to network security, this thesis first improves and optimizes the models and structures of deep learning algorithms, then uses them to extract various kinds of URL feature information, and finally constructs multi-model methods for recognizing and detecting malicious URLs. The work extends the relevant theoretical research in this field and provides a theoretical basis for practical applications. The main contents of this thesis are as follows:

(1) The URL datasets are collected from open-source websites such as PhishTank using crawlers and other techniques. The data is then cleaned, screened, filtered, and finally normalized into a standard form through data preprocessing.

(2) A variety of features useful for identifying and detecting malicious URLs are analyzed. First, information theory and traditional methods are applied to label URLs. Second, several kinds of URL features are extracted, including lexical, structural, and texture features, and especially character distribution and frequency features; feature vectors are then generated after the missing feature values are handled. The vectors are fed into a pre-optimized model for detection and classification. Finally, the model is trained with labeled URLs as the training set and unlabeled URLs as the test set.

(3) A new recognition and detection method, JCLA, based on an attention-based convolutional neural network and a long short-term memory network, is designed. Compared with previous methods, it has several advantages: the JCLA model attends to a variety of feature information and learns URL features at multiple levels, which improves detection performance; it has a simple structure, requires no additional knowledge of statement analysis, and is robust and generalizes well; and it improves detection efficiency while also achieving better classification results.

(4) Because traditional lexical features cannot fully describe malicious URLs and deep neural network models are slow at detection, a joint model, CSaC, is proposed that combines a slice recurrent neural network (SRNN) with a multi-layer convolutional neural network (CNN). The model performs well in feature representation and malicious URL detection, and it is also faster at detection than traditional deep learning models. Moreover, the higher the feature dimension, the more pronounced the speed advantage of the CSaC model becomes.
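
The lexical, structural, and character-frequency features mentioned in item (2) can be illustrated with a minimal Python sketch. It is not the thesis's actual feature set: the function names (`shannon_entropy`, `url_features`, `to_vector`), the chosen features, and the default of 0.0 for missing values are all assumptions made for illustration, and only the standard library is used.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def url_features(url: str) -> dict:
    """Hypothetical lexical/structural/character features for one URL."""
    # Assumption: URLs without a scheme are treated as http for parsing only.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    n = len(url)
    digits = sum(ch.isdigit() for ch in url)
    letters = sum(ch.isalpha() for ch in url)
    return {
        "url_length": n,                                   # lexical
        "host_length": len(host),                          # lexical
        "num_dots_in_host": host.count("."),               # structural
        "path_depth": len([s for s in parts.path.split("/") if s]),
        "has_query": 1.0 if parts.query else 0.0,          # structural
        "digit_ratio": digits / n if n else 0.0,           # character frequency
        "letter_ratio": letters / n if n else 0.0,
        "special_ratio": (n - digits - letters) / n if n else 0.0,
        "entropy": shannon_entropy(url),                   # character distribution
    }


def to_vector(features: dict, names: list, default: float = 0.0) -> list:
    """Order features into a vector; missing values fall back to `default`."""
    return [float(features.get(name, default)) for name in names]


if __name__ == "__main__":
    names = ["url_length", "host_length", "path_depth", "digit_ratio", "entropy"]
    print(to_vector(url_features("http://example.com/a/b"), names))
```

A vector produced this way could be passed to any classifier; how the thesis combines such hand-crafted features with the features learned by its deep models is not specified in the abstract.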
Keywords/Search Tags: malicious URL, texture feature, attention mechanism, convolutional neural network, slice recurrent neural network