
Research And Implementation Of Heuristic Detection Technology Of Phishing Website

Posted on: 2018-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zhang
Full Text: PDF
GTID: 2348330533969611
Subject: Computer Science and Technology
Abstract/Summary:
Phishing websites publish misleading information to entice Internet users into submitting personal information, which is then used to steal their property. Phishing is one of the most common network attacks against Internet users. To improve the accuracy of phishing website detection and weaken the dependence on third-party tools and resources, this paper studies heuristic detection and topic recognition for phishing websites.

First, this paper studies website pre-processing. For data crawling and storage, it proposes an update-storage strategy that crawls and stores, at regular intervals, the phishing websites published on a third-party platform. For website text feature extraction, it uses the m-TextRank algorithm to extract and store website keywords.

To improve detection precision and stability, the approach is optimized by identifying new features promptly and selecting the best feature subset. This paper proposes a novel multi-layer heuristic anti-phishing model composed of a feature extraction layer, a feature selection layer and a heuristic classification layer. Five feature selection algorithms are applied to pre-process the feature sets, and three classification algorithms based on decision trees are then applied to distinguish phishing websites from legitimate websites. Experimental results show that the proposed model, using the IG algorithm for feature subset selection and the RT algorithm for heuristic classification, achieves 96% accuracy and 95% recall with lower time cost.

Considering the relationship between a webpage's topic and its legitimacy, and the distribution of phishing websites, this paper proposes a theme recognition algorithm based on LDA-SVM. Through pre-processing, Gibbs sampling, LDA modeling, SVM classification and effect evaluation, an LDA-SVM theme classification model is constructed. Experimental results show that theme recognition reaches an accuracy of 93%. This topic classification model is then used to identify the topic of the phishing pages flagged by the detection model above, and the identification result provides supporting evidence for the detection result.

On the basis of this research, a multi-layer heuristic system for detecting phishing websites is implemented. For a new URL, the system can store its webpage resources, promptly detect and classify its legitimacy, and accurately identify the webpage theme. System tests show that the system's functions meet the requirements for detecting the legitimacy of unknown sites and that its performance meets expectations.
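The abstract does not list the features or describe how the feature selection layer is implemented. As a rough illustration only, and assuming that "IG" refers to information gain, the sketch below ranks binary features by how much they reduce the entropy of the phishing/legitimate label. The feature names (`has_ip_in_url`, `has_at_symbol`, `long_url`) and the toy samples are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (feature name, information gain) pairs, highest gain first."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = phishing, 0 = legitimate.
names = ["has_ip_in_url", "has_at_symbol", "long_url"]
rows = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0)]
labels = [1, 1, 0, 0]

ranked = rank_features(rows, labels, names)
for name, gain in ranked:
    print(f"{name}: {gain:.3f}")
```

In a pipeline like the one described, the top-ranked features would form the subset passed on to the decision-tree classifiers. How many features to keep is a tuning choice that the abstract does not specify.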
Keywords/Search Tags:Phishing webpage, LDA-SVM theme recognition, decision tree, heuristic detection