Research On Universal Recognition Algorithm For Text-based CAPTCHA

Posted on: 2017-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Xie
GTID: 2348330512464446
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of the Internet, the network has become the main channel for accessing vast amounts of information. Network applications now reach into every major area, including communications, finance, and news, and network security has become an unavoidable issue of great significance in each of them. CAPTCHA is a form of Turing test that distinguishes the responses of humans from those of computer programs, blocks malicious programs from carrying out network attacks, and thereby protects network security. Among the many kinds of CAPTCHA, text-based CAPTCHA is the most widely used because it offers a relatively good balance of user experience, security, system complexity, and economic cost. The main research work and contributions are as follows:
1) This paper studies the traditional CAPTCHA segmentation and recognition algorithms and finds that they rely on prior knowledge of the CAPTCHA, for example the number of characters, the character width, and the degree of stickiness (touching) between characters. The traditional segmentation algorithms cannot effectively segment sticky and distorted characters, and the traditional recognition process accumulates errors because mistakes in character segmentation carry over into recognition. As a result, these traditional methods cannot be applied universally.
2) Because traditional character segmentation algorithms rely too heavily on prior knowledge of the CAPTCHA and cannot segment sticky and distorted characters, we present a CAPTCHA character segmentation algorithm based on the character skeleton. The algorithm analyzes the skeleton of the sticky and distorted characters, pinpoints the segmentation points on the skeleton, and cuts the skeleton at those points. Each single character is then restored from its segmented skeleton. This algorithm overcomes the limitations of the traditional algorithms and is more general.
3) To improve the traditional recognition process, we present a CAPTCHA assembly recognition algorithm based on character skeleton points. The algorithm replaces character segmentation with character blocks and block assembly, and combines them with an optimal search to filter the results from all assemblies. This addresses the error accumulation caused by character segmentation in the traditional algorithm. The novel step is to cut characters into blocks at skeleton points, which avoids both an overwhelming number of character fragments during blocking and incomplete characters during assembly. Experimental results show that the recognition rate of our algorithm is better than that of the traditional algorithm. Because our algorithm does not depend on prior knowledge of the CAPTCHA, it is more general for CAPTCHA recognition.
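As an illustration of the block assembly idea in 3), the following is a minimal Python sketch, not the thesis's implementation. It assumes the blocks are already ordered left to right, that one character spans at most `max_span` consecutive blocks, and that a classifier returns a label and a log-confidence for each candidate group. Dynamic programming stands in here for the optimal search, whose details the abstract does not give. The names `best_assembly`, `toy_score`, and `TOY_MODEL` are hypothetical, and short strings stand in for image fragments.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical scorer type: takes a group of consecutive blocks and
# returns (predicted label, log-confidence). Higher is better.
Scorer = Callable[[Sequence[str]], Tuple[str, float]]


def best_assembly(blocks: Sequence[str], score: Scorer,
                  max_span: int = 3) -> Tuple[float, str]:
    """Group consecutive blocks into characters so that the summed
    log-confidence is maximal. Returns (total score, recognized text)."""
    n = len(blocks)
    neg_inf = float("-inf")
    best: List[float] = [neg_inf] * (n + 1)   # best[i]: best score for blocks[:i]
    back: List[Optional[Tuple[int, str]]] = [None] * (n + 1)
    best[0] = 0.0

    for end in range(1, n + 1):
        # Try every group blocks[start:end] of at most max_span blocks.
        for start in range(max(0, end - max_span), end):
            if best[start] == neg_inf:
                continue
            label, conf = score(blocks[start:end])
            total = best[start] + conf
            if total > best[end]:
                best[end] = total
                back[end] = (start, label)

    # Walk the back-pointers from the end to recover the labels.
    labels: List[str] = []
    pos = n
    while pos > 0:
        start, label = back[pos]
        labels.append(label)
        pos = start
    labels.reverse()
    return best[n], "".join(labels)


# Toy stand-in for a trained classifier (an assumption for this sketch):
# known block groups map to a label and a log-confidence.
TOY_MODEL = {
    ("|", "-", "|"): ("H", -0.1),
    ("|",): ("I", -0.3),
    ("-",): ("-", -1.5),
    ("(", ")"): ("O", -0.2),
    ("(",): ("C", -0.4),
}


def toy_score(group: Sequence[str]) -> Tuple[str, float]:
    # Unknown groups get a heavy penalty instead of being rejected outright.
    return TOY_MODEL.get(tuple(group), ("?", -10.0))


if __name__ == "__main__":
    total, text = best_assembly(["|", "-", "|", "(", ")"], toy_score)
    print(text)  # prints HO
```

Because every grouping of the blocks is scored as a whole, a poor cut in one place does not force an error downstream, which is the property the assembly approach is meant to provide. Summing log-confidences is one possible objective; a real system might normalize by the number of characters or add a language prior.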
Keywords/Search Tags: CAPTCHA recognition, character segmentation, character block, assembly recognition, optimal search