
Research on a Deduplication Method for Web Vulnerability Detection Based on the DBSCAN Algorithm

Posted on: 2020-04-29 | Degree: Master | Type: Thesis
Country: China | Candidate: Y F Jia | Full Text: PDF
GTID: 2428330575453254 | Subject: Engineering
Abstract/Summary:
With the rapid development of the Internet, web platforms have become the main medium of social communication, and people's lifestyles have changed greatly in areas such as shopping, education, and communication. At the same time, attacks by hackers on websites seriously threaten users' interests. If a website's vulnerabilities can be scanned quickly and defenses deployed before an attack occurs, the site's security is greatly improved, which is of considerable practical value.

At present, the web crawlers used by traditional web vulnerability detection systems do not crawl website page data completely and make no provision for anti-crawling mechanisms, which reduces the feasibility of vulnerability detection and increases the false negative rate. In addition, the page data collected by the crawler is not deduplicated by an efficient and accurate algorithm, which slows down the scanning system. This approach no longer meets the needs of the security industry, and a new solution is required.

An automated web crawler based on simulated human-computer interaction imitates human behavior more faithfully: it can analyze the structure of a website more comprehensively, process and collect page data more effectively, and resist anti-crawling mechanisms to a certain extent. The density-based DBSCAN clustering algorithm can characterize the features of each page and its differences from other pages, achieving fast and accurate deduplication. This thesis therefore combines a human-computer-interaction web crawler with the DBSCAN clustering algorithm to study web vulnerability scanning. The specific research contents are as follows:

(1) Selenium (a browser automation testing framework) is combined with the headless Chrome browser as the core of the crawler model: Selenium simulates the behavior of real users, while headless Chrome acts as a real browser without a user interface. Experiments show that this crawler framework reproduces a user's actions in the browser well, bypasses anti-crawling mechanisms to a certain extent, analyzes website structure more comprehensively, and collects more complete data, improving the feasibility of vulnerability detection and reducing its false negative rate.

(2) A large amount of website page data was collected with the crawler, and the density-based DBSCAN clustering algorithm was applied to the features of these pages to cluster them by similarity. Experiments show that this deduplication method handles page deduplication better than a rule-based (regular-expression) algorithm and speeds up vulnerability detection.

(3) Combining the human-computer-interaction web crawler and the DBSCAN clustering algorithm with SQL injection and XSS detection plug-ins, an efficient web vulnerability detection system was designed. Tests comparing this system with the AWVS vulnerability scanner show that it achieves better detection results.
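The abstract does not specify which page features are clustered. One common choice for structural page similarity is a tag-frequency vector; the sketch below is an illustrative assumption, not the thesis's actual feature set, and projects a page's HTML onto a fixed tag vocabulary using only the Python standard library:

```python
from collections import Counter
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Count how often each HTML tag occurs in a page."""
    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1

def page_vector(html, tags=("a", "form", "input", "table", "div")):
    """Project a page onto a fixed tag vocabulary -> numeric feature vector.

    The vocabulary here is a hypothetical example; any tag set could be used.
    """
    parser = TagCounter()
    parser.feed(html)
    return [parser.counts[t] for t in tags]

page = "<div><form><input><input></form><a href='#'>x</a></div>"
print(page_vector(page))  # [1, 1, 2, 0, 1]
```

Structurally similar pages (e.g. two product pages from the same template) map to nearby vectors even when their visible text differs, which is what a density-based deduplication step needs.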
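Given per-page feature vectors, deduplication reduces to clustering near-identical pages. The following minimal pure-Python DBSCAN is a sketch of the general algorithm, not the thesis's implementation; `eps`, `min_pts`, and the toy vectors are illustrative choices. It labels each page with a cluster id, with -1 marking noise, so that pages sharing a cluster need only be scanned once:

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: return a cluster id per point, or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force epsilon-neighborhood query (includes the point itself).
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may be relabeled as a border point later
            continue
        labels[i] = cluster            # i is a core point: start a new cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:     # former noise becomes a border point
                labels[j] = cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_pts:   # j is also core: expand the cluster
                queue.extend(more)
        cluster += 1
    return labels

# Three near-duplicate pages cluster together; the distinct page is noise.
pages = [(10, 2, 1), (10, 2, 1), (11, 2, 1), (3, 9, 0)]
print(dbscan(pages, eps=1.5, min_pts=2))  # [0, 0, 0, -1]
```

DBSCAN suits this task because the number of duplicate groups is unknown in advance (unlike k-means) and unique pages fall out naturally as noise rather than being forced into a cluster.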
Keywords/Search Tags:Web vulnerability detection, human-computer interaction, web crawler, density clustering, vulnerability detection