
Chinese Text Recognition In Natural Scenes Based On Deep Learning

Posted on: 2023-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Jia
Full Text: PDF
GTID: 2568306836974479
Subject: Control engineering
Abstract/Summary:
The history of writing can be traced back thousands of years. Text is the carrier through which human beings transmit information and pass on culture. It also carries rich and precise semantic information and appears throughout real-life scenes. In today's era of artificial intelligence, natural scene text recognition has attracted growing attention and has become a research hotspot in computer vision and pattern recognition. Currently, most natural scene text detection and recognition algorithms train models only for English characters. English has few character types with simple strokes, whereas Chinese characters are numerous and varied in their combinations. Although recognition of printed Chinese text is relatively mature, Chinese text in natural scenes may be inclined, curved, of indeterminate length, and varied in color. This irregularity makes text detection and recognition more difficult.

This thesis studies Chinese text recognition in natural scenes. To address the above problems, on the detection side, a multi-scale Chinese text detection method that integrates an attention mechanism is proposed, which improves the detection accuracy for multi-directional and curved Chinese text. On the recognition side, a Chinese text recognition method based on an improved CRNN (Convolutional Recurrent Neural Network) is proposed, which improves the recognition accuracy for oblique and curved Chinese text. The main contents are as follows:

(1) The lightweight ResNet18 is used as the backbone network of the detection model. Because the distribution of the features extracted by FPN (Feature Pyramid Network) is uncertain for Chinese text in natural scenes, a balanced attention mechanism (BAM) is embedded to extract effective text features and suppress inefficient feature channels. To address the loss of local and detail information in the image during ASPP (Atrous Spatial Pyramid Pooling) downsampling, ASPP is improved to reduce the loss of feature map resolution. Experiments show that these improvements effectively raise the recall and precision of detection.

(2) To address the insufficient feature information and small receptive field of FPN when extracting Chinese text in natural scenes, the FPN embedded with the attention mechanism and the improved atrous spatial pyramid pooling (IASPP) are placed in parallel to enhance feature extraction and fusion. To address the imbalance between positive and negative samples, the logarithmic AC Loss is introduced into the binary map loss of the differentiable binarization module. Compared with existing detection algorithms, the proposed method performs excellently in both detection accuracy and speed.

(3) For inclined and curved Chinese text in natural scenes, an STN (Spatial Transformer Network) is added to the recognition framework to geometrically rectify the samples. The following improvements are then made to CRNN. In the feature extraction part, VGG is replaced with a multi-layer residual network integrated with the CBAM attention mechanism to strengthen the extraction of text features. In the recurrent layer, because the long short-term memory (LSTM) network has a complex structure, many parameters, and a tendency to overfit, the bidirectional LSTM that models the feature sequence is replaced with a bidirectional content-adaptive recurrent unit (CARU), which improves the recognition accuracy and operating efficiency of the model. In sequence decoding, the CRNN transcription layer, CTC, is replaced with a joint CTC-Attention mechanism for training. The experimental results show that these CRNN-based improvements optimize the decoding output of the recognition model.
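To make the role of the CTC transcription layer mentioned in (3) more concrete, the following is a minimal sketch of standard CTC best-path (greedy) decoding: take the highest-scoring class in each frame, merge consecutive repeats, then drop the blank symbol. It is an illustration under stated assumptions, not the thesis's implementation. The function name `ctc_greedy_decode`, the blank index 0, and the toy character set are all hypothetical, and the attention branch and the joint CTC-Attention training are not shown.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(frame_scores, charset, blank=BLANK):
    """Best-path CTC decoding over per-frame class scores.

    frame_scores: list of frames, each a list of scores (one per class).
    charset: list mapping class index to character (entry at `blank` is unused).
    """
    # Step 1: pick the highest-scoring class index in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Step 2: merge consecutive repeats of the same class index.
    collapsed = [index for index, _ in groupby(best_path)]
    # Step 3: remove blanks and map the remaining indices to characters.
    return "".join(charset[index] for index in collapsed if index != blank)


if __name__ == "__main__":
    # Toy character set (hypothetical): blank placeholder plus two characters.
    charset = ["", "中", "文"]
    frames = [
        [0.90, 0.05, 0.05],  # blank
        [0.10, 0.80, 0.10],  # 中
        [0.20, 0.70, 0.10],  # 中 (repeat, merged with the previous frame)
        [0.80, 0.10, 0.10],  # blank (separates two genuine occurrences of 中)
        [0.10, 0.60, 0.30],  # 中
        [0.10, 0.20, 0.70],  # 文
    ]
    print(ctc_greedy_decode(frames, charset))
```

The blank frame between the second and third occurrences of 中 is what allows a doubled character to survive the repeat-merging step; without it, the two would collapse into one.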
Keywords/Search Tags: deep learning, Chinese text in natural scenes, attention mechanism, text detection, text recognition