
Research On Chinese CAPTCHA Recognition Technology With Convolutional Neural Network

Posted on: 2024-08-24
Degree: Master
Type: Thesis
Country: China
Candidate: B Z Liu
Full Text: PDF
GTID: 2568307163963679
Subject: Information and Communication Engineering
Abstract/Summary:
CAPTCHA is a technology for distinguishing human users from computer programs, and it is widely deployed on major websites. By using different types of CAPTCHAs, websites can effectively block malicious activity by automated programs, such as spamming, brute-force account cracking, and large-scale crawling. Research on CAPTCHA recognition is therefore of great significance for improving CAPTCHA security and optimizing CAPTCHA design. At present, the most widely used CAPTCHAs are based on digits and English letters, but these can be recognized automatically at a high rate and no longer meet the security requirements of website protection. Because Chinese CAPTCHAs are harder to crack and therefore more secure, they are widely used on major Chinese websites. However, research on Chinese CAPTCHA recognition still faces many challenges. First, CAPTCHA images contain interference such as noise dots, interference lines, character distortion, and character adhesion, all of which make recognition harder. Second, Chinese CAPTCHAs draw on a large character set with complex character structures, which also makes recognition difficult. In recent years, with the rapid development of machine learning, many scholars have begun applying deep learning to CAPTCHA recognition, but because of the complex structure and stroke features of Chinese characters, existing network models cannot fully learn their features, so recognition accuracy is generally low. The main research content of this paper is as follows:

(1) A detailed CAPTCHA dataset construction method and a CAPTCHA pre-processing pipeline are proposed. Different types of CAPTCHA datasets, including Chinese CAPTCHAs and numeric-alphabetic CAPTCHAs, are built through automatic code generation and web crawling to support evaluation and testing of model performance. A complete pre-processing pipeline, comprising grayscale conversion, binarization, denoising, and character segmentation, provides the data foundation for the subsequent recognition models.

(2) A Chinese CAPTCHA recognition model based on a convolutional neural network is proposed. The convolution operations in traditional deep learning methods struggle to extract the complete features of structurally complex Chinese characters. This paper therefore designs a convolutional neural network that introduces the Inception mechanism: by using multiple convolutional kernels of different sizes within the Inception structure, it extracts multi-scale, multi-level features of Chinese CAPTCHAs in parallel. Model performance is further improved by optimizing the network structure and running multiple sets of experiments to select appropriate network parameters. The experimental results show that, compared with AlexNet, VGG-16, and ResNet-34, the proposed method greatly improves the CAPTCHA recognition rate.

(3) An end-to-end Chinese CAPTCHA recognition network that requires no character segmentation is proposed. The segment-then-recognize approach is only applicable to CAPTCHAs that can be segmented easily. For CAPTCHAs with touching or overlapping characters, poor segmentation greatly reduces recognition accuracy. This paper therefore proposes an end-to-end recognition network into which the original CAPTCHA image can be fed directly for training and recognition. The network uses an improved ResNet-18 structure as the feature extraction module. Because Chinese CAPTCHAs usually consist of multiple Chinese characters and are sequential in nature, a Transformer is used as the encoder-decoder module to capture the sequence information in the extracted features and to encode and decode the sequences. Recognition accuracy is further raised by improving the loss function. The network supports end-to-end learning, avoids the effects of character segmentation, and has good generality and robustness.
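The four pre-processing stages named in contribution (1), grayscale conversion, binarization, denoising, and character segmentation, can be illustrated with a minimal pure-Python sketch. The abstract does not say which algorithm the thesis uses at each stage, so everything concrete here is an assumption chosen for brevity: BT.601 luminance weights, a fixed threshold of 128, an isolated-pixel filter, and segmentation by vertical projection. The function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Grid = List[List[int]]         # rows of integers


def to_grayscale(rgb: List[List[Pixel]]) -> Grid:
    """Convert RGB rows to luminance (assumed ITU-R BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def binarize(gray: Grid, threshold: int = 128) -> Grid:
    """Mark pixels darker than the (assumed, fixed) threshold as ink = 1."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def denoise(binary: Grid, min_neighbors: int = 1) -> Grid:
    """Drop ink pixels with fewer than min_neighbors ink pixels among their 8 neighbours."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            neighbors = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbors < min_neighbors:
                out[y][x] = 0
    return out


def segment_columns(binary: Grid) -> List[Tuple[int, int]]:
    """Split at empty columns of the vertical projection.

    Returns half-open [start, end) column ranges, one per ink run.
    """
    h, w = len(binary), len(binary[0])
    projection = [sum(binary[y][x] for y in range(h)) for x in range(w)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(projection):
        if count and start is None:
            start = x                      # an ink run begins
        elif not count and start is not None:
            spans.append((start, x))       # the ink run ends at an empty column
            start = None
    if start is not None:
        spans.append((start, w))           # run touches the right edge
    return spans
```

Calling `segment_columns(denoise(binarize(to_grayscale(image))))` yields one column range per separated character. Projection-based splitting only works when an empty column exists between characters, so touching or overlapping characters merge into a single span; that is the failure case the abstract gives as motivation for the segmentation-free network in contribution (3).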
Keywords/Search Tags: CAPTCHA recognition, image processing, convolutional neural network, Transformer mechanism