
Study On Completely Automated Public Turing Test To Tell Computers And Humans Apart (CAPTCHA) Recognition Based On SSD

Posted on: 2019-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: J J Luo
Full Text: PDF
GTID: 2428330566477354
Subject: Applied Statistics
Abstract/Summary:
This thesis studies the recognition of CAPTCHAs (verification codes). Verification codes were created to prevent malicious password cracking, forum flooding, "brushing," and "swiping." As network usage has grown, the volume of verification code data has become extremely large, which has increased the demand for labeled verification codes. In the past, verification codes were always labeled manually, which is slow and costly, so there is an urgent need for computers to recognize them.

CAPTCHA recognition is generally divided into three steps: positioning (localization), segmentation, and identification. Of these, the accuracy of positioning plays an important role in identifying the verification code. Character localization in verification code images is generally performed with conventional methods such as the K-nearest-neighbor classifier, BP networks, and SVM. These methods are all statistical, and they perform well on low-difficulty verification code images, but they are underpowered when characters are conglutinated (stuck together), so a more accurate positioning method is required. This thesis uses the SSD (Single Shot MultiBox Detector) deep learning network, an object detection model proposed in a paper at ECCV 2016 (European Conference on Computer Vision), which maintains test accuracy while preserving training speed.

After the positioning method is chosen, the next step is data preprocessing. The data come from a total of 546,423 verification code images provided by the customer. The image tag information is interpreted to sort and filter the image data; random sampling is then used to select 30,000 initial images, which are filtered again to leave 20,000 images as the data set, and the data set is numbered. The data set is divided into a training set and a test set at a ratio of 9:1, giving 18,000 training images and 2,000 test images. Finally, this data set is used as the training data for localization and recognition.

Implementing the positioning and recognition algorithms, that is, the SSD network training and recognition algorithms, requires preparing the data sets and compiling Caffe in advance. Some minor problems arise once training starts and need to be checked carefully; the most important checks are whether the paths in the network file are correct and whether the parameters are set correctly. An appropriate training strategy is also needed so that training converges faster.

Previously, verification codes were identified manually, which was slow and costly; the computer recognition used in this thesis is fast and efficient, with a correspondingly low cost. The accuracy of manual coding is below 85%, and the customer's requirement is 85%, so the goal is a recognition accuracy higher than that of manual labeling. The method in this thesis reaches an accuracy of 85.796%. Localization and segmentation have traditionally relied on conventional algorithms such as SVM; this thesis uses the SSD deep learning network, which is more convenient and accurate than those methods. Deep learning has been applied in all walks of life, and the future world will develop toward network intelligence. Deep learning is a very good way to realize artificial intelligence, so research on deep learning needs further development.
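
The 9:1 division of the 20,000-image data set described in the abstract can be illustrated with a minimal Python sketch. This is not the author's code: the function name `split_dataset`, the `img_000000`-style identifiers, and the fixed random seed are all assumptions made for illustration, and the abstract does not say whether the split was shuffled.

```python
import random


def split_dataset(image_ids, train_parts=9, test_parts=1, seed=0):
    """Shuffle image IDs and split them by a train:test ratio (hypothetical helper)."""
    shuffled = list(image_ids)             # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a reproducible split
    # Integer arithmetic avoids floating-point rounding in the split point.
    n_train = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical numbered identifiers standing in for the 20,000 filtered images.
image_ids = [f"img_{i:06d}" for i in range(20000)]
train_ids, test_ids = split_dataset(image_ids)
print(len(train_ids), len(test_ids))
```

With 20,000 identifiers and a 9:1 ratio, the two lists contain 18,000 and 2,000 entries, matching the training and test set sizes reported in the abstract.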
Keywords/Search Tags: CAPTCHA Recognition, Character Positioning, SSD Network, Data Preprocessing