
Research On Industrial Character Segmentation And Recognition Under Complex Background

Posted on: 2021-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Wu
Full Text: PDF
GTID: 2428330629953111
Subject: Computer Science and Technology
Abstract/Summary:
Optical Character Recognition (OCR) technology dates back to the 1960s. As it has progressed from simple printed text to character recognition in a variety of complex scenes, it has attracted more and more attention. As industry enters the 4.0 era, industrial production is being upgraded to a highly digital and intelligent mode, and information technology combines physical entities with networks to inject new vitality into production. Automatic character recognition has become one of the research hotspots in this field. Industrial production requires that product information on the production line be detected and recognized quickly, but several factors interfere with the process, such as noise in the production environment, lighting conditions, and the diversity of industrial characters, all of which make character recognition more difficult. In view of these difficulties, this thesis studies the recognition of single characters and of whole character sequences, based respectively on the traditional character recognition process and the deep learning character recognition process. The specific work includes the following aspects:

(1) For the traditional character recognition process, a character segmentation method based on connected domains and geometric features is proposed to solve the segmentation problems posed by structurally discontinuous characters and by adherent (touching) characters. Before segmentation, Blob analysis is used to remove interfering information from the image and obtain the character area. Tilted characters are corrected through ellipse fitting and affine transformation. The segmentation method itself works in two passes. First, morphological filling and connectivity analysis are performed on discontinuous characters, and a rough segmentation is obtained from the connected domains of the characters. The second pass then examines each segment of the rough segmentation: the minimum bounding rectangles of the connected domains are divided into equally spaced rectangles according to the width and height of the initial characters, so as to obtain single characters. Finally, the method is used to segment the dot-matrix character images collected for this thesis, and k-nearest neighbor, support vector machine, and multilayer feed-forward neural network classifiers are used to verify the quality of the segmented characters. Experimental results show that the segmentation method based on connected domains and geometric features segments discontinuous or adherent characters more effectively than traditional methods.

(2) An end-to-end character recognition network combining CRNN (Convolutional Recurrent Neural Network), CTPN (Connectionist Text Proposal Network), and an attention mechanism is proposed to achieve segmentation-free, multi-type recognition of industrial characters. Industrial character recognition differs from general document recognition in that the image background is complex, the character types are diverse, the layout is not fixed, and there is considerable noise interference. Traditional methods have difficulty locating and segmenting characters under these conditions, so this part of the work builds on currently popular natural scene text recognition techniques. A CTPN network is added in front of the CRNN so that the text regions of interest can be detected directly in complex images. At the same time, an attention mechanism is introduced into the CRNN to assign weights to the character sequence features and improve the network's attention to key information in long-sequence text images. Recognition remains stable even when the character sequence is long and the background and noise interfere. Experimental results show that the combination of CTPN, CRNN, and the attention mechanism detects and recognizes industrial characters better than a CRNN network alone.

(3) For the traditional and deep learning character recognition processes used in this thesis, a dot-matrix character recognition system based on Qt and a character recognition system based on a Flask web application are designed. Operating the methods through an interface makes the experimental results more intuitive and makes it easier to compare and analyze the two methods.
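
The two-pass segmentation described in (1) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the square dilation window standing in for morphological filling, the `char_width` reference width, and the `tolerance` threshold are all assumptions chosen for illustration, and the equal split here uses only the character width.

```python
from collections import deque


def dilate(img, radius=1):
    """Binary dilation with a square window (an assumed stand-in for
    morphological filling) so that the dots of one character connect."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def component_boxes(img):
    """First pass: inclusive bounding boxes (x0, y0, x1, y1) of the
    8-connected foreground regions, sorted left to right."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return sorted(boxes)


def split_wide_boxes(boxes, char_width, tolerance=1.5):
    """Second pass: a box wider than tolerance * char_width is assumed to
    hold touching characters and is cut into equal-width rectangles."""
    result = []
    for x0, y0, x1, y1 in boxes:
        width = x1 - x0 + 1
        parts = round(width / char_width) if width > tolerance * char_width else 1
        step = width / parts
        for i in range(parts):
            left = x0 + round(i * step)
            right = x0 + round((i + 1) * step) - 1
            result.append((left, y0, right, y1))
    return result


def segment(img, char_width, radius=1):
    """Fill gaps, find connected domains, then split over-wide boxes."""
    return split_wide_boxes(component_boxes(dilate(img, radius)), char_width)
```

Note that in this sketch the dilation enlarges every box by `radius` pixels, and characters separated by a gap of `2 * radius` pixels or fewer are merged in the first pass, so the radius has to be chosen against the actual dot spacing and character spacing of the images.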
Keywords/Search Tags: industrial characters, character segmentation, text detection, convolutional recurrent neural network, attention mechanism