
The Research On Phishing Detection Using Webpage Noise And N-gram

Posted on: 2016-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: L F Yin
GTID: 2308330470977309
Subject: Computer application technology
Abstract/Summary:
Phishing is a social-engineering attack carried out online: the attacker uses deception to obtain users' confidential information for financial gain. Because phishing attacks are now launched rapidly and at scale, a new, efficient, and accurate defense is urgently needed, and research on phishing defense has high practical value.

This thesis first gives a brief overview of phishing research at home and abroad and introduces the purpose and significance of the research. It then discusses the concept and basic process of a phishing attack and summarizes the main characteristics of phishing defense techniques. The main work and research results are as follows.

(1) Webpage noise is extracted, after preprocessing, from phishing samples in a database of PayPal and eBay phishing websites. The noise is combined with n-gram techniques to form webpage features, and these features are compared for similarity with those of the protected websites. The similarity between a protected website and a phishing website is then used to detect phishing attacks. On this basis the thesis proposes "A Phishing Detection Algorithm using Webpage Noise and n-gram". The algorithm describes a webpage by its noise and sets a screening threshold for phishing websites. It selects webpage noise that is small in volume and relatively stable within the page and uses it to characterize the website. Compared with other algorithms, it has lower computational cost, runs faster, and detects phishing in a more timely manner.

(2) The detection algorithm is applied to the phishing website database and the results are studied. Detection thresholds are set for PayPal and eBay phishing websites, and precision and recall are then calculated for the algorithm's output.
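As an illustration of the precision and recall calculation mentioned in item (2), the following minimal Python sketch uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). The function name `precision_recall` and the boolean label encoding are assumptions made for this example; the thesis does not give its evaluation code.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for a phishing detector.

    predicted, actual: equal-length sequences of booleans,
    where True means "phishing". (Encoding is an assumption
    for this sketch, not taken from the thesis.)
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # phishing, flagged
    fp = sum(1 for p, a in pairs if p and not a)    # legitimate, flagged
    fn = sum(1 for p, a in pairs if not p and a)    # phishing, missed

    # Guard against division by zero when nothing is flagged
    # or when the sample contains no phishing pages.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Precision answers "of the pages flagged as phishing, how many really were?", while recall answers "of the real phishing pages, how many were flagged?". Both depend on the detection threshold chosen for each protected site.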
The calculation shows that the algorithm achieves high accuracy and is reliable. Precision reached 0.8863 for PayPal and 0.8964 for eBay; recall was 0.8550 for PayPal and 0.8229 for eBay. The results also suggest that the same perpetrators may reuse the same phishing website template, often against the same protected website, which points to repeated, team-based criminal activity.

(3) A total of 2490 PayPal phishing websites and 1699 eBay phishing websites, verified and published by PhishTank, were collected. A phishing webpage feature matrix was obtained by describing each page with the combined webpage-noise and n-gram analysis algorithm. Cluster analysis of the feature matrix showed that 83.33% of the PayPal phishing websites and 81.63% of the eBay phishing websites are similar to one another. We conclude that phishing websites use identical or highly similar webpage templates and repeat the same malicious behavior against particular protected sites; in other words, phishing is carried out by organized teams.
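The comparison described in item (1) can be illustrated with a small Python sketch. It is not the thesis's algorithm: the abstract does not specify how noise is extracted, which n-gram granularity is used, which similarity measure is applied, or what the threshold values are. The sketch assumes the noise has already been extracted into a string, uses character n-grams with Jaccard similarity as a stand-in measure, and uses a hypothetical threshold of 0.5. All function names are invented for illustration.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams in text (assumed granularity)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; a stand-in for the thesis's measure."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def looks_like_phish(candidate_noise, protected_noise, n=3, threshold=0.5):
    """Flag a candidate page whose noise resembles the protected site's noise.

    candidate_noise, protected_noise: strings of already-extracted
    webpage noise (the extraction step is outside this sketch).
    threshold: hypothetical screening threshold, not a value from the thesis.
    Returns (is_flagged, similarity).
    """
    sim = jaccard(char_ngrams(candidate_noise, n),
                  char_ngrams(protected_noise, n))
    return sim >= threshold, sim
```

A page that imitates a protected site tends to share much of its noise, so its n-gram set overlaps heavily with the protected site's and its similarity score rises above the threshold. Because only the noise is compared rather than the full page, the feature sets stay small, which is consistent with the low computational cost claimed in the abstract.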
Keywords/Search Tags: Phishing, Phishing detection, Web noise, n-gram