Research On Text-based Captcha Recognition Technology Based On Image Processing And Convolutional Neural Network

Posted on: 2021-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Huang
GTID: 2518306476950879
Subject: Cyberspace security
Abstract/Summary:
With the rapid development of Internet technology, cyberspace security issues have become increasingly prominent. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a common method for securing human-computer interaction on the Internet. It uses a simple Turing test to determine whether the current user is a human or a program simulating one. Text-based CAPTCHA is one of the most widely used forms and is common in website logins and other steps that need to be secured. Research on text-based CAPTCHA recognition is therefore an active topic in cyberspace security. Researchers in China and abroad have proposed a variety of text-based CAPTCHA recognition methods, but these may need large experimental data sets or complex classification models, and they generally require manual participation in the preliminary work, which leads to poor versatility. The recognition results are especially unsatisfactory when characters are stuck together, distorted, or deformed. To address these problems, this thesis applies image classification and computer vision techniques to text-based CAPTCHAs that use anti-segmentation measures such as character adhesion, distortion, and deformation. The main work and innovations are as follows:

(1) Preprocessing and character segmentation of text-based CAPTCHA images are based on digital image processing. A pre-segmentation step is added to the character segmentation stage. It examines the actual characteristics of each image, using projection analysis, a color filling algorithm, and prior knowledge such as character width, to decide whether to segment with the connected domain method or with the drop-fall algorithm. With the connected domain method, a single connected domain obtained by the color filling algorithm, or a combination of such domains, is extracted as an individual character. With the drop-fall algorithm, the trajectory of the water droplet is used as the segmentation curve, and the droplet's starting point is determined in the pre-segmentation step. The drop-fall algorithm is also modified to suit the characteristics of text-based CAPTCHA images. In addition, during character segmentation the starting point of the droplet is adjusted according to the actual situation and the algorithm is rerun, to ensure that each extracted character is individual and complete.

(2) A convolutional neural network is constructed to train on and recognize the individual characters. Convolutional neural networks perform well in image classification and recognition, and text-based CAPTCHA recognition is essentially a problem in this field. This thesis designs a convolutional neural network structure for character features and also introduces the center loss, which minimizes intra-class differences. Traditional text-based CAPTCHA recognition schemes that use convolutional neural networks do not include the center loss and can only maximize inter-class differences. Combining the two can further improve recognition accuracy.

(3) The convolutional neural network is built with the TensorFlow framework, a text-based CAPTCHA recognition system is built on a real data set, and comprehensive experiments are carried out. The experimental results show that the proposed character segmentation method achieves a segmentation success rate of more than 98.5% on the experimental data set. The convolutional neural network with the center loss also performs better in the character recognition stage, achieving a recognition accuracy of 56.29% to 99.57% on the experimental data set.
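
The joint objective described in contribution (2) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulation, so everything here is an assumption: the function names, the weight `lam`, the averaging over the batch, and the use of plain Python lists in place of TensorFlow tensors. It follows the commonly used form in which a softmax cross-entropy term separates classes and a center loss term pulls each feature vector toward the center of its own class.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample's softmax output against its true label."""
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(features, labels, centers):
    """Half the mean squared distance from each feature to its class center.

    Assumption: the sum is averaged over the batch; some formulations
    use the plain sum instead.
    """
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total / len(features)


def joint_loss(logits_batch, features, labels, centers, lam=0.01):
    """Softmax loss (inter-class) plus lam times center loss (intra-class).

    lam is a hypothetical weight; the abstract does not state its value.
    """
    ce = sum(
        softmax_cross_entropy(logits, y)
        for logits, y in zip(logits_batch, labels)
    ) / len(labels)
    return ce + lam * center_loss(features, labels, centers)


if __name__ == "__main__":
    # Two samples, two classes, 2-D features (toy values for illustration).
    features = [[1.0, 2.0], [3.0, 4.0]]
    labels = [0, 1]
    centers = [[1.0, 2.0], [3.0, 2.0]]  # one center per class
    logits_batch = [[0.0, 0.0], [0.0, 0.0]]
    print(joint_loss(logits_batch, features, labels, centers, lam=0.5))
```

In a real training loop the class centers would also be updated as training proceeds; that step is omitted here because the abstract does not describe how the thesis handles it.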
Keywords/Search Tags: Cyberspace Security, Text-based CAPTCHA, Character Recognition, Convolutional Neural Network