
And Design Of Anti-Phishing System

Posted on: 2014-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: G L Tan
Full Text: PDF
GTID: 2248330398971577
Subject: Computer technology
Abstract/Summary:
Viruses and Trojan horses have long been regarded as the main threats to network security. With the widespread use of the Internet, however, phishing has emerged as a form of attack that is rising year by year, and phishing fraud is becoming increasingly rampant. Phishing uses high-fidelity imitation websites that pass the fake off as genuine to steal users' private and financial information for commercial gain, which poses a serious threat to online transactions and electronic commerce.

At present, the main anti-phishing techniques are list-based detection and page similarity detection. Blacklist detection, however, lags behind newly created phishing sites, and page similarity detection suffers from a low detection rate. After an in-depth analysis of phishing websites and a survey of existing anti-phishing techniques, this thesis designs a phishing detection system with a high detection rate and a low false alarm rate, based on page similarity matching with the cosine theorem (cosine similarity). It adds a domain list module based on URL splicing and an unknown-phishing detection module; the domain blacklist module improves the detection rate. Feature template classification based on the support vector machine (SVM) improves the efficiency and accuracy of template classification. The main work of this thesis includes:

1. Page similarity matching based on the cosine theorem
The HTML is converted into a DOM tree structure, followed by word segmentation and noise removal, and the TF-IDF algorithm is used to extract the highest-scoring words in the DOM tree as features. The cosine theorem is then used to calculate the similarity between the page and the features; comparing the result against a set threshold determines whether the page is a phishing website.
2. Design of feature template classification based on the support vector machine
Current phishing templates include fields such as category, brand, and feature words, and the existing phishing feature templates serve as the training set. The feature templates are first preprocessed, including stop-word and noise removal, and converted into a vector space model with the TF-IDF algorithm; an SVM classification model for templates is then constructed. With regular retraining, new phishing templates are classified automatically. Previously, phishing templates had to be classified manually, and the heavy workload and low accuracy wasted resources. This module saves time and improves the efficiency and accuracy of template classification.

3. Unknown phishing detection based on URL splicing
Because the web paths of phishing sites are relatively concentrated, domain names obtained from a DNS server are spliced with commonly used phishing paths to test for new, unknown phishing websites, which helps minimize losses. This splicing method can target large business transaction sites such as Taobao, PayPal, and eBay, as well as easily imitated domains.

By implementing these main functions, the system improves the accuracy of phishing page detection and reduces the false alarm rate. At the same time, studying the latest phishing sites improves the detection rate of the anti-phishing system.
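
The page similarity matching in item 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes the page text has already been extracted from the DOM tree, uses a simple regular-expression tokenizer in place of real word segmentation, and all function names, the smoothed IDF formula, the threshold of 0.5, and the feature count `k` are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a stand-in for real word segmentation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} vector per document (smoothed IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def top_features(vector, k):
    """Keep only the k highest-weighted terms as the feature set."""
    best = sorted(vector.items(), key=lambda item: item[1], reverse=True)[:k]
    return dict(best)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def page_similarity(page_text, template_text, k=20):
    """Similarity between a page and the top-k features of a known template."""
    page_vec, template_vec = tf_idf_vectors([page_text, template_text])
    return cosine_similarity(page_vec, top_features(template_vec, k))


def looks_like_phishing(page_text, template_text, threshold=0.5, k=20):
    """Flag the page when its similarity reaches the (hypothetical) threshold."""
    return page_similarity(page_text, template_text, k) >= threshold
```

In practice the threshold would have to be tuned on labeled pages, since it controls the trade-off between detection rate and false alarm rate that the abstract describes.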
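
The template classification in item 2 can be sketched in the same spirit. The example below trains a small linear SVM from scratch by sub-gradient descent on the regularized hinge loss; this is a simplification chosen to keep the example dependency-free, not the model the thesis built. The two categories (+1 for a "bank" template, -1 for a "shopping" template), the training texts, the stop-word list, and the hyperparameters are all hypothetical, and raw term counts stand in for the TF-IDF weights the abstract mentions.

```python
import re
from collections import Counter

STOP_WORDS = {"your", "the", "a", "to"}  # hypothetical stop-word list

# Hypothetical training templates: +1 = bank category, -1 = shopping category.
TEMPLATES = [
    ("bank account login password verify", 1),
    ("online bank transfer account security", 1),
    ("shop cart order payment delivery", -1),
    ("store order discount cart checkout", -1),
]


def featurize(text):
    """Term counts after stop-word removal (a stand-in for TF-IDF weights)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def decision(w, b, x):
    """Signed distance-like score of sparse vector x under weights w, bias b."""
    return sum(w.get(term, 0.0) * value for term, value in x.items()) + b


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Sub-gradient descent on the L2-regularized hinge loss; labels are +1/-1."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * decision(w, b, x)
            # Regularization step: shrink every weight slightly.
            for term in w:
                w[term] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin move the model.
            if margin < 1.0:
                for term, value in x.items():
                    w[term] = w.get(term, 0.0) + lr * label * value
                b += lr * label
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if decision(w, b, x) >= 0.0 else -1
```

A new template would be classified with `predict(w, b, featurize(text))` after training on the labeled set; retraining on a schedule corresponds to the regular learning the abstract describes. A real system with more than two categories would need a multi-class scheme such as one-vs-rest.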
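
The URL splicing in item 3 amounts to combining each observed domain with each commonly used phishing path. The sketch below shows only that candidate-generation step, under stated assumptions: the function name, the `http` default scheme, and the example domains and paths are hypothetical, and fetching the candidates and passing them to a detector is left out.

```python
from urllib.parse import urlunsplit


def splice_candidates(domains, paths, scheme="http"):
    """Yield one candidate URL for every (domain, common path) pair."""
    for domain in domains:
        # Normalize the host: trim whitespace, lowercase, drop a trailing dot.
        host = domain.strip().lower().rstrip(".")
        if not host:
            continue
        for path in paths:
            # Ensure exactly one leading slash on the path component.
            yield urlunsplit((scheme, host, "/" + path.lstrip("/"), "", ""))
```

For example, the hypothetical inputs `["Example.test."]` and `["login.php", "/secure/index.html"]` produce `http://example.test/login.php` and `http://example.test/secure/index.html`. Each candidate would then be checked by the similarity and classification modules.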
Keywords/Search Tags:phishing, similarity, active-detection, support vector machine