
General Attack Of Text-based CAPTCHAs Based On Gabor Filters

Posted on: 2018-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Zhang
Full Text: PDF
GTID: 2348330518998979
Subject: Computer software and theory
Abstract/Summary:
CAPTCHAs are widely used on the Internet to protect against hacker attacks and against malicious bots that try to crack passwords. The purpose of a CAPTCHA is to distinguish real users from computers automatically; in other words, it is a test that a human can pass easily but a computer program cannot. Text-based CAPTCHAs are still the most widely used kind, so they remain worth studying. Many attacks have been proposed, and this prior work has advanced the scientific understanding of CAPTCHA robustness, but most of these attacks have limited applicability.

In this thesis, we propose a simple, effective and general method to break the majority of text-based CAPTCHAs. The method has only two main steps: component extraction and recognition. In the extraction step, we use Log-Gabor filters to extract character components from the CAPTCHA image along four directions. In the recognition step, all extracted components are sorted by the coordinates (x, y) of each component's top-left pixel, and our attack constructs a graph from the sorted components. A graph pruning algorithm then removes the redundant nodes, and the remaining candidates are recognized. We choose K-nearest neighbor (KNN) as the recognition algorithm because it performs better than CNN in this process and because it does not need to be trained on the sample set, which saves a lot of time. Finally, we adopt a dynamic programming approach to find the combination with the largest confidence value, which is taken as the recognition result for the CAPTCHA image. To emphasize the simplicity of the method, we do not apply any preprocessing to the CAPTCHA images, which is entirely different from traditional methods. This is the first time Log-Gabor filters have been used to extract components in order to break CAPTCHAs, and we consider this a major innovation.

In our experiments, we compare graph search algorithms, extraction orientations, filters and classifiers, and we compare our attack with prior art; the results demonstrate the feasibility and superiority of the attack. To show the effectiveness and generality of the method, we break text-based CAPTCHAs of various design styles, including hollow CAPTCHAs, CAPTCHAs with separated characters and CAPTCHAs with connected characters. To obtain these different styles, we attack the CAPTCHAs of the 20 most popular websites according to the Alexa ranking, such as Google, Yahoo!, Microsoft and Amazon. Across the different CAPTCHA schemes, the success rate varies from 5% to 77%. Besides a relatively high success rate, we also achieve a breakthrough in attack speed: the attack time for each CAPTCHA scheme is no more than 15 seconds. For a deeper analysis of the method's generality, we also test the attack on schemes that are generally considered hard, such as an old version of reCAPTCHA, the Yandex scheme and a hard Yahoo! scheme, and it achieves a certain success rate on them. In addition, by analyzing the security of text-based CAPTCHAs, we put forward some suggestions for the design of more secure CAPTCHAs.
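
The recognition step described in the abstract (sort the components by top-left pixel, then use dynamic programming to pick the reading with the largest confidence) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The names `sort_components`, `best_reading`, `score` and `max_span` are hypothetical, the graph construction and pruning are reduced to "merge up to `max_span` consecutive components into one character candidate", total confidence is assumed to be the sum of per-character confidences, and the classifier (KNN in the thesis) is replaced by a hand-made lookup table.

```python
from typing import Callable, List, Tuple

# Hypothetical component record: (x, y, name), where (x, y) is the top-left pixel.
Component = Tuple[int, int, str]


def sort_components(components: List[Component]) -> List[Component]:
    """Sort components by the (x, y) coordinates of their top-left pixel."""
    return sorted(components, key=lambda c: (c[0], c[1]))


def best_reading(
    n: int,
    score: Callable[[int, int], Tuple[str, float]],
    max_span: int = 3,
) -> Tuple[str, float]:
    """Pick the segmentation of n sorted components with the largest total confidence.

    score(i, j) stands in for the classifier: it returns (label, confidence)
    for the candidate built by merging sorted components i .. j-1.
    Assumption: the total confidence is the sum of the per-candidate confidences.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)   # best[j]: best total confidence for components 0 .. j-1
    back = [None] * (n + 1)      # back[j]: (start index, label) of the last candidate
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_span), j):
            label, conf = score(i, j)
            if best[i] + conf > best[j]:
                best[j] = best[i] + conf
                back[j] = (i, label)
    # Walk the back-pointers from the end to recover the labels in order.
    labels = []
    j = n
    while j > 0:
        i, label = back[j]
        labels.append(label)
        j = i
    return "".join(reversed(labels)), best[n]


# Toy stand-in for the classifier: four stroke components, made-up confidences.
TOY_SCORES = {
    (0, 1): ("l", 0.3), (1, 2): ("l", 0.3), (0, 2): ("H", 0.9),
    (2, 3): ("I", 0.4), (3, 4): ("<", 0.2), (2, 4): ("K", 0.95),
}


def toy_score(i: int, j: int) -> Tuple[str, float]:
    # Merges that are not in the table are treated as unrecognizable.
    return TOY_SCORES.get((i, j), ("?", 0.0))


text, total = best_reading(4, toy_score)
print(text)  # prints "HK"
```

In the toy table, merging components 0-1 into "H" and 2-3 into "K" beats every reading that keeps the strokes separate, so the sketch returns "HK". A summed confidence favors readings with more characters when the scores are close; whether the thesis normalizes for this is not stated in the abstract.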
Keywords/Search Tags:CAPTCHA, K-Nearest Neighbor, CNN, Log-Gabor Filter