Design And Implementation Of Malicious Access Identification System Based On URL

Posted on: 2020-09-23
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Li
Full Text: PDF
GTID: 2428330572973711
Subject: Software engineering
Abstract/Summary:
As network traffic grows, effectively identifying malicious access has become a network security problem that needs to be addressed. Most existing detection methods rely on domain name blacklists and ignore the possibility of malicious access to domains that are not blacklisted. Identifying malicious access effectively is therefore of great significance for improving network security. This thesis asks whether a user's access to a URL is malicious and proposes a URL-based malicious access detection model that uses time series analysis and is independent of existing blacklists. A malicious access identification system is designed and implemented on top of this model. The work consists of two parts. First, using the logs of users' URL accesses to a domain name, the characteristics of malicious access are studied and quantified along multiple dimensions, such as domain name access similarity, information entropy, and power spectral density. Manual labeling is then used to correct the inaccurate sample labels near the critical points in the results of the Gaussian mixture model (GMM), and a malicious access detection model is generated with a safe semi-supervised support vector machine. Second, based on the proposed detection model, a URL-based malicious access identification system is designed and implemented, comprising a training subsystem and an identification subsystem. The system detects high-frequency malicious access behavior from uploaded URL access logs. It can also use its integrated annotation tools to continuously feed strong labels from manual feedback into the identification model, optimizing the detection model. Experimental results show that the model correctly detects 89.6% of access types when the false positive rate on the data set is 1%. The experimental data set consists of 20000 samples, and one third of the training set is labeled. In conclusion, the model detects malicious access with high accuracy.
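The abstract names information entropy and power spectral density among the features extracted from URL access logs, but does not say how they are computed. The following is a minimal sketch under stated assumptions, not the thesis's implementation: it assumes entropy is taken over the empirical distribution of requested URLs and that the spectrum is a simple periodogram of per-bin access counts. The function names and the sample data are hypothetical.

```python
import cmath
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy in bits of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def periodogram(series):
    """Naive O(n^2) periodogram: |DFT|^2 / n for k = 0 .. n // 2, mean removed."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power.append(abs(coeff) ** 2 / n)
    return power


def dominant_period(series):
    """Period (in bins) of the strongest non-zero frequency, or None if flat."""
    if len(series) < 2:
        return None
    power = periodogram(series)
    k = max(range(1, len(power)), key=lambda i: power[i])
    if power[k] < 1e-12:
        return None
    return len(series) / k


# Hypothetical data: URLs requested by one client, and access counts per time bin.
urls = ["/a", "/a", "/b", "/b"]
counts_per_bin = [5, 3, 0, 3] * 4

url_entropy = shannon_entropy(urls)        # low entropy: few distinct URLs
period = dominant_period(counts_per_bin)   # strong periodicity suggests automation
```

Under these assumptions, a client that requests few distinct URLs at a strongly periodic rate would show low entropy and a sharp spectral peak, which is the kind of signal a downstream classifier could use.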
Keywords/Search Tags: malicious access, time series, Gaussian mixture model, semi-supervised support vector machine