| Phishing is an identity theft attack in which criminals create fake websites that impersonate well-known organizations and direct users to fraudulent pages in order to steal their critical and private information. In response to illegal phishing attacks, domestic and foreign researchers have carried out extensive anti-phishing research. As phishing attackers’ capabilities and attack methods continue to improve, heuristic-based and machine-learning-based phishing website detection technologies have gradually become the focus of that research. At present, almost all phishing website defense methods passively wait for the client to supply the domain names to be checked, so many phishing websites that have already been registered and used for malicious purposes may remain undiscovered. An important way to defend against phishing websites is therefore to proactively generate likely phishing domain names and check whether they have been registered and used for phishing attacks. This thesis proposes a phishing website discovery algorithm based on active detection, which generates new candidate phishing domain names from a target website's domain name and the known phishing domain names that target it. By analyzing the relationship between the target domain name and the phishing domain names, as well as the relationship among phishing websites that share the same target domain name, the thesis designs a generation algorithm for suspicious phishing domain names that produces more diverse and effective candidates and improves the efficiency of phishing website discovery. DNS probing and phishing website detection are then performed on the suspicious domain names, which finally leads to the discovery of new phishing websites. Further, based on the algorithm, a phishing website discovery system based on active detection is designed and implemented. The system includes a data collection module, a data persistence module, a suspicious phishing domain name generation module, a DNS detection module, a phishing website detection module, and a result display module. This thesis uses collected public phishing website domain names and target domain names as datasets, uses target domain names as the input to the algorithm and the system for testing, and measures major domestic and foreign websites. The experimental results show that, compared with existing methods, the detection rate and survival rate of the phishing website discovery algorithm on the collected phishing website dataset increase by about 10%. effective ways to improve cyber defense capabilities. |
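
The active-detection idea described in the abstract (generate suspicious domain names from a target, then probe them over DNS) can be illustrated with a minimal sketch. The abstract does not specify the thesis's generation algorithm, so the sketch substitutes generic typosquatting-style edits (character omission, adjacent transposition, character duplication, TLD substitution). The function names, the `ALT_TLDS` list, and the injectable `resolver` parameter are all assumptions made for illustration. The phishing page classification step is omitted.

```python
import socket
from typing import Callable, Iterable, List, Set

# Hypothetical list of alternative TLDs; not taken from the thesis.
ALT_TLDS = ("com", "net", "org")


def generate_candidates(target: str) -> Set[str]:
    """Return simple look-alike variants of a target such as 'example.com'.

    Illustrative only: these are generic typosquatting edits, not the
    generation algorithm proposed in the thesis.
    """
    name, sep, tld = target.partition(".")
    if not sep or not name or not tld:
        raise ValueError("expected a domain of the form 'name.tld'")

    variants: Set[str] = set()

    # Character omission: drop one character at each position.
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])

    # Adjacent transposition: swap each pair of neighbouring characters.
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Character duplication: double the character at each position.
    for i in range(len(name)):
        variants.add(name[:i] + name[i] + name[i:])

    # Remove the unchanged name and the empty label, if produced.
    variants.discard(name)
    variants.discard("")

    candidates = {f"{v}.{tld}" for v in variants}

    # TLD substitution: keep the name, change the top-level domain.
    candidates.update(f"{name}.{alt}" for alt in ALT_TLDS if alt != tld)
    candidates.discard(target)
    return candidates


def resolves(domain: str) -> bool:
    """Return True if the domain currently resolves to at least one address."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False


def probe(candidates: Iterable[str],
          resolver: Callable[[str], bool] = resolves) -> List[str]:
    """Return the sorted candidates for which the resolver reports a DNS record."""
    return sorted(d for d in candidates if resolver(d))
```

Passing the resolver as a parameter keeps the generation and probing steps separable, so the candidate list can be checked offline with a stub before any real DNS queries are sent. A domain that resolves is only a candidate; in the pipeline the abstract describes, it would still have to pass phishing website detection.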