Research on a Phishing Detection System Based on AdaBoost

Posted on: 2016-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Li
GTID: 2308330464462437
Subject: Computer technology
Abstract/Summary:
Phishing is a form of fraud in which an attacker obtains user passwords by masquerading as a trusted email or website. Phishing attacks have become more diverse, which makes them harder to prevent and detect. According to statistics, losses caused by phishing attacks have multiplied in recent years. Phishing has therefore become one of the most serious threats to network security: it not only erodes trust between people online but also seriously hampers the development of electronic commerce.
Current phishing detection techniques have two main problems: detection operates at a single level, and the information collected is not comprehensive. This thesis therefore proposes a detection method that combines URL blacklist and whitelist filtering with machine learning (the AdaBoost algorithm). The main work is as follows:
First, an unknown site is passed through URL blacklist and whitelist filtering. If a match is found, the result is output directly; if not, the classifier is used for detection. In this way, phishing websites that are already listed can be detected quickly, while new forms of phishing websites are detected by machine learning.
The key to the classifier is feature extraction. To obtain enough information about phishing websites, this thesis extracts fourteen features from the URL, five features from the structure of the webpage, and a large number of features from the webpage content, all of which are used to train and test the classifier.
The extracted features may contain considerable noise and have high dimensionality, so a data preprocessing module is used to reduce dimensionality and remove noise.
After comparing the detection performance of the kNN, naive Bayes, logistic regression, and AdaBoost algorithms, AdaBoost is selected as the detection method in this thesis.
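The two-stage flow described above (list lookup first, classifier as a fallback) can be sketched roughly as follows. This is only an illustration under stated assumptions: the list entries, the five URL features, and the names `url_features`, `detect`, and `toy_classifier` are hypothetical and are not taken from the thesis, whose fourteen URL features are not listed in the abstract. The toy rule stands in for a trained AdaBoost model.

```python
from urllib.parse import urlparse

# Hypothetical list entries; a real system would load maintained lists.
BLACKLIST = {"paypa1-login.example.net"}
WHITELIST = {"www.example.com"}


def url_features(url):
    """Return a few illustrative URL features (not the thesis's fourteen)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "has_at_symbol": "@" in url,
        # Crude check: host consists only of digits and dots.
        "host_is_ip": host.replace(".", "").isdigit(),
        "uses_https": parsed.scheme == "https",
    }


def detect(url, classifier):
    """Stage 1: list lookup. Stage 2: fall back to the classifier."""
    host = urlparse(url).hostname or ""
    if host in BLACKLIST:
        return "phishing"
    if host in WHITELIST:
        return "legitimate"
    # No list match: let the (assumed) trained model decide.
    return "phishing" if classifier(url_features(url)) else "legitimate"


def toy_classifier(features):
    """Stand-in for a trained model; flags two suspicious URL patterns."""
    return features["has_at_symbol"] or features["host_is_ip"]
```

A call such as `detect(url, toy_classifier)` returns the list verdict when the host is listed and otherwise the classifier's verdict, so the cheap lookup handles known sites and only unknown sites pay the cost of feature extraction.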
To address the unequal misclassification costs in phishing detection, the thesis proposes an improved algorithm named AdaCostBoost. Experimental results show that the improved algorithm maintains detection precision while reducing the rate at which legitimate websites are misjudged as phishing, which lessens the impact of misjudgment and makes the method more practical to deploy.
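The abstract does not give the details of the improved algorithm, so the following is not the thesis's method. It is a minimal sketch of one common way to make AdaBoost cost-sensitive: seeding the initial sample weights with per-class misclassification costs, so that errors on legitimate sites weigh more from the first round. The decision-stump weak learner, the function names, and the `cost` dictionary are all assumptions made for illustration.

```python
import math


def train_stump(X, y, w):
    """Find the one-feature threshold rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def fit_cost_adaboost(X, y, cost, rounds=10):
    """AdaBoost whose initial weights are proportional to class costs.

    y uses +1 for phishing and -1 for legitimate; cost maps each label
    to the (assumed) cost of misclassifying a sample of that class.
    """
    w = [cost[yi] for yi in y]
    total = sum(w)
    w = [wi / total for wi in w]
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # avoid division by zero
        if err >= 0.5:  # weak learner is no better than chance
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Standard AdaBoost update: raise weights of misclassified samples.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1
```

With a cost such as `{1: 1, -1: 3}`, ambiguous samples are pushed toward the legitimate class, which trades some recall on phishing sites for fewer false alarms on normal websites.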
Keywords/Search Tags:Phishing, characteristics, SVD, AdaBoost, cost