
Research and Implementation of Key Technologies for Automatic Recognition of Complex Text-Based CAPTCHAs

Posted on: 2022-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zhao
GTID: 2518306563465594
Subject: Computer technology
Abstract/Summary:
Text-based CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are currently the most widely used CAPTCHA mechanism. They effectively block malicious automated programs on websites and help keep network systems secure and stable. Research on recognizing text-based CAPTCHAs has high application value in projects such as Robotic Process Automation (RPA), and it can also drive the design of more secure CAPTCHAs. To make cracking harder, the design of text-based CAPTCHAs has moved from simple to complex. Complex characters, backgrounds, and types, together with variable-length text, mean that traditional optical character recognition can no longer meet general recognition requirements. Scene text recognition (STR) and image classification based on deep learning simplify the preprocessing and feature extraction steps, handle these problems well, and have strong potential for complex text-based CAPTCHA recognition.

The innovation of this paper is to combine deep-learning-based image classification with STR, apply the combination to complex text-based CAPTCHAs, and research and implement an automatic recognition algorithm, providing new ideas for text-based CAPTCHA recognition. The paper summarizes four types of complex text-based CAPTCHAs: the idiom type, the Chinese-character numerical formula type, the numerical formula type, and the English letters and digits type. Based on these types, a classification and recognition dataset was built for the research. The main work includes three parts:

(1) Based on a study of the Residual Network (ResNet), a ResNet classification model is built to classify complex text-based CAPTCHAs by type, reaching an accuracy of 98.42%. The residual structure avoids problems such as vanishing gradients, and the shortcut connections allow features to be fully reused. A carefully chosen network structure keeps the amount of computation under control, which makes the classification task both efficient and accurate.

(2) The classic STR algorithm, the convolutional recurrent neural network (CRNN), is well suited to recognizing variable-length text. Building on this advantage and combining CRNN with the ResNet network and a visual attention mechanism, this paper proposes the Attention Residual Recurrent Neural Network (AR-RNN) for recognizing complex text-based CAPTCHAs. Recognition accuracy on the four types of complex text-based CAPTCHAs reaches 93.68%, 97.21%, 94.43%, and 98.03% respectively.

(3) The ResNet classification model is combined with the AR-RNN recognition model, and a mathematical calculation step is added, to implement the automatic recognition algorithm for complex text-based CAPTCHAs. The algorithm achieves good results in both recognition success rate and inference time.

The automatic recognition algorithm is applied in RPA project practice, where a CAPTCHA recognition subsystem is designed and implemented, demonstrating the practical value of this research. The proposed algorithm has high recognition accuracy and an efficient network structure, which makes it a promising candidate for other fields that require CAPTCHA recognition.
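The way part (3) chains classification, recognition, and the mathematical calculation step can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the type labels, the `classify` and `recognize` callables (stand-ins for the ResNet classifier and the AR-RNN recognizer), the assumption that the recognizer receives the predicted type, and the formula format such as `3+5=?` are all hypothetical.

```python
import re
from typing import Any, Callable

# Hypothetical labels for the four CAPTCHA types named in the abstract.
IDIOM, CN_FORMULA, NUM_FORMULA, EN_NUM = "idiom", "cn_formula", "num_formula", "en_num"

# Assumed mapping from single Chinese numerals and operator words to ASCII.
# Multi-character numerals such as "十二" are not handled in this sketch.
CN_TO_ASCII = {
    "零": "0", "一": "1", "二": "2", "三": "3", "四": "4",
    "五": "5", "六": "6", "七": "7", "八": "8", "九": "9",
    "加": "+", "减": "-", "乘": "*", "除": "/",
}

# Assumed formula layout: <integer><operator><integer>, followed by anything.
FORMULA_RE = re.compile(r"^(\d+)([+\-*/])(\d+)")


def solve_formula(text: str) -> str:
    """Evaluate a recognized formula string and return the result as text."""
    ascii_text = "".join(CN_TO_ASCII.get(ch, ch) for ch in text)
    match = FORMULA_RE.match(ascii_text)
    if match is None:
        raise ValueError(f"unrecognized formula: {text!r}")
    left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
    if op == "+":
        return str(left + right)
    if op == "-":
        return str(left - right)
    if op == "*":
        return str(left * right)
    return str(left // right)  # assumption: division CAPTCHAs expect an integer


def answer_captcha(
    image: Any,
    classify: Callable[[Any], str],
    recognize: Callable[[Any, str], str],
) -> str:
    """Classify the CAPTCHA type, recognize its text, then compute if needed."""
    captcha_type = classify(image)           # stand-in for the ResNet classifier
    text = recognize(image, captcha_type)    # stand-in for the AR-RNN recognizer
    if captcha_type in (CN_FORMULA, NUM_FORMULA):
        return solve_formula(text)           # the added calculation step
    return text                              # idiom / English types: text is the answer
```

With stub models, `answer_captcha(None, lambda img: NUM_FORMULA, lambda img, t: "12-7=?")` returns `"5"`, while an idiom or English-type CAPTCHA passes the recognized text through unchanged.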
Keywords/Search Tags:Complex text-based CAPTCHAs, Scene text recognition, Image classification, ResNet, CRNN, Visual attention mechanism, RPA