
Design And Implementation Of Phishing Website Detection System Based On Linear SVM

Posted on: 2020-12-20; Degree: Master; Type: Thesis
Country: China; Candidate: W T Wang; Full Text: PDF
GTID: 2428330575454161; Subject: Industrial engineering
Abstract/Summary:
With the rapid development of the information age, network security has received more and more attention. The emergence of e-commerce-related products has made personal information security a serious issue that calls for a corresponding response. In the information age, smarter methods make detecting phishing websites easier and more convenient. Phishing websites have long been one of the problems that network security needs to solve: they are highly concealed, and the losses they cause are often large. Many scholars studying phishing websites use machine learning algorithms to classify phishing websites and normal websites. Based on the classification algorithms commonly used in phishing website detection, this paper compares the URL features and page content features of websites, and designs and implements a high-performance phishing website detection system. The main work is as follows:

1) First, this paper analyzes a complete phishing website attack case and then reviews current phishing website detection technology, including the blacklist and whitelist detection mechanism, the heuristic detection mechanism, and the visual similarity detection mechanism. The advantages and disadvantages of these detection mechanisms are compared and summarized.

2) Then, drawing on research into the development trend of phishing websites, the phishing website detection engine is designed and implemented. The first part of the engine is the blacklist and whitelist detection mechanism: the principles of several query algorithms are analyzed, and the best algorithm is optimized. The main job of the blacklist detection mechanism is to directly filter a large number of authenticated websites and so reduce system performance overhead. The second part of the engine is the URL detection mechanism. By collecting the URLs of current phishing websites and analyzing their characteristics, 11 URL features of phishing websites are obtained, which are trained and classified with a logistic regression algorithm. The last part is the detection of page content features. The linear SVM algorithm is found to perform well when classifying high-dimensional, small data sets, and the best number of page content features is obtained through experimental comparison.

3) Finally, the overall architecture design and deployment mode of the phishing website detection system, and the running performance of the system, are introduced. The architecture design takes the performance problems of the service system into account, and a high-performance service architecture is designed and implemented. For deployment, an Nginx reverse proxy server is proposed, and its principle is analyzed, in order to load balance the whole system. The system performance test first trains each detection module and compares common classification algorithms on page detection performance. Then the system is tested as a whole to obtain the final test results.
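The three-part engine described in the abstract (list lookup, then URL classification, then page content classification) can be pictured as a cascade. The sketch below is only an illustration under stated assumptions. The list contents, the feature names in `url_features`, the `detect` function, and the order in which stage results are combined are all hypothetical. The abstract does not list its 11 URL features or say how the stages vote. The trained logistic regression and linear SVM models are stood in for by plain callables.

```python
import re
from urllib.parse import urlparse

# Hypothetical list contents; the thesis does not publish its lists.
WHITELIST = {"example.com"}
BLACKLIST = {"phish.example.net"}


def url_features(url):
    """Extract a few illustrative lexical URL features.

    These are NOT the 11 features from the thesis, which the abstract
    does not enumerate; they are common examples of the same kind.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "has_at": "@" in url,
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
    }


def detect(url, url_model, page_model, fetch_page):
    """Run the assumed cascade and return 'legitimate' or 'phishing'.

    url_model:  callable(features dict) -> bool, stands in for the
                logistic regression classifier on URL features.
    page_model: callable(page text) -> bool, stands in for the linear
                SVM classifier on page content features.
    fetch_page: callable(url) -> str, supplies the page content.
    """
    host = urlparse(url).hostname or ""
    # Stage 1: cheap set lookups filter known sites before any model runs.
    if host in WHITELIST:
        return "legitimate"
    if host in BLACKLIST:
        return "phishing"
    # Stage 2: URL-only classification, no network access needed.
    if url_model(url_features(url)):
        return "phishing"
    # Stage 3: page content classification, the most expensive step.
    return "phishing" if page_model(fetch_page(url)) else "legitimate"
```

Ordering the stages from cheapest to most expensive matches the abstract's stated goal of reducing system performance overhead: a set lookup avoids both feature extraction and page download for sites that are already known.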
Keywords/Search Tags:Linear SVM, Phishing website, Feature extraction, System design