
Research And Implementation Of Irregular Scene Text Detection And Recognition Integrating Representation Learning

Posted on: 2022-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: G Y Deng
GTID: 2518306338970539
Subject: Electronic Science and Technology
Abstract/Summary:
Irregular scene text detection and recognition is a key technology and a popular research topic in computer vision, and plays a vital role in machine navigation, image retrieval, scene text understanding, real-time translation, and industrial automation. A typical text recognition system works in two stages: a text detection algorithm first locates each text instance in the image, and a text recognition algorithm then recognizes each instance. The detection algorithm takes high-resolution images as input, and its accuracy and speed significantly affect recognition performance. Meanwhile, the visual diversity of scene text is a major challenge for robust recognition. Three difficulties therefore remain: (1) recent text detection networks have low computational efficiency, which makes it difficult to balance inference efficiency and accuracy; (2) imbalance among text categories leads to poor robustness of text recognition in complex scenes; (3) irregular scene text challenges the generalization capacity of text recognition models. These difficulties limit the performance of text recognition systems and hinder their application. This thesis therefore focuses on text detection networks, text representation learning, and irregular text rectification. The main work is as follows:

1. Recurrent progressive segmentation network for scene text detection. Existing text detection methods struggle to balance inference efficiency and detection accuracy, mainly because the detection model learns text representations inefficiently. Recent methods apply feature pyramid networks (FPN) to enhance multi-scale semantics but ignore efficiency, which leads to large redundancy in computational cost. This thesis proposes a text detection algorithm based on recurrent progressive segmentation. First, a progressive constraint mechanism is introduced to ensure that semantic features are enhanced across the stacked feature pyramid networks. Then, a semantic constraint mechanism is applied to the middle-layer features to guide the network toward extracting more accurate text semantics. Experiments show that the proposed method improves detection accuracy by 2.0% within a limited parameter and computation budget, and outperforms other recent text detection networks in balancing inference efficiency and accuracy. Comparison results on four public datasets show that the proposed network also outperforms most recent methods.

2. Robust representation learning for scene text recognition. Regarding the generalization of text recognition models in complex scenes, we find that existing data is category-imbalanced: the number of samples varies greatly across characters. This makes it difficult for the model to learn robust text representations and causes accuracy to decline. This thesis first proposes a text representation network based on coordinate encoding to describe the spatial semantics of character strokes, and then proposes an encoder-decoder-based representation learning objective that integrates category correlation to constrain the intra-class consistency and inter-class distinction of the feature space. Experimental results show that the proposed method alleviates the category imbalance problem and learns robust text representations, improving recognition accuracy by 3.0% in complex scenes and by 1.0% in easy scenes and outperforming other recent text recognition methods.

3. Siamese network for irregular text recognition. For irregular text recognition, we find that existing training data lacks irregular text samples, which makes it difficult to learn to rectify severely deformed text images and results in poor recognition accuracy on irregular text. This thesis proposes a novel text rectification algorithm based on a Siamese network, which changes the training procedure of the text rectification network. First, random affine transformations are used to augment the training data and construct image pairs; the Siamese network then learns affine-invariant text rectification from these pairs. Experiments on irregular text datasets show that the proposed method improves recognition accuracy on irregular text by 2.8% and correctly rectifies severely curved and vertical text. Compared with recent irregular text recognition methods, the proposed algorithm achieves competitive performance.

In summary, this thesis builds an irregular scene text detection and recognition system capable of efficient text detection, robust text recognition, and irregular text rectification. Finally, a feasible demonstration system for irregular scene text detection and recognition is designed and implemented.
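
The image-pair construction in the third contribution (random affine augmentation feeding a Siamese network) might look roughly like the sketch below. It is an illustration only, not the thesis implementation: the function names `random_affine`, `warp`, and `make_pair`, the parameter ranges, and the nearest-neighbour warp on a plain 2D list are all assumptions made here for a self-contained example. The abstract does not state the transformation ranges or the loss used to train on the pairs, so neither is reproduced.

```python
import math
import random


def random_affine(rng, max_rotate_deg=30.0, max_shear=0.3, scale_range=(0.8, 1.2)):
    """Sample a 2x3 affine matrix: scale * rotation * shear, no translation.

    All ranges are hypothetical defaults, not values from the thesis.
    """
    theta = math.radians(rng.uniform(-max_rotate_deg, max_rotate_deg))
    shear = rng.uniform(-max_shear, max_shear)
    scale = rng.uniform(*scale_range)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rotation [[cos, -sin], [sin, cos]] times shear [[1, s], [0, 1]], then scaled.
    return [
        [scale * cos_t, scale * (cos_t * shear - sin_t), 0.0],
        [scale * sin_t, scale * (sin_t * shear + cos_t), 0.0],
    ]


def warp(image, matrix):
    """Warp a 2D list image about its centre.

    Uses inverse mapping with nearest-neighbour sampling; pixels that map
    outside the source image are filled with 0.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    (a, b, tx), (c, d, ty) = matrix
    det = a * d - b * c
    # Inverse of the 2x2 linear part.
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx - tx, y - cy - ty
            sx = ia * dx + ib * dy + cx
            sy = ic * dx + id_ * dy + cy
            xi, yi = int(round(sx)), int(round(sy))
            if 0 <= xi < w and 0 <= yi < h:
                out[y][x] = image[yi][xi]
    return out


def make_pair(image, rng):
    """Return two independently warped views of the same text image."""
    return warp(image, random_affine(rng)), warp(image, random_affine(rng))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 5x9 "text line": a horizontal bar of ones.
    image = [[1 if row == 2 else 0 for _ in range(9)] for row in range(5)]
    view_a, view_b = make_pair(image, rng)
    for row in view_a:
        print("".join("#" if v else "." for v in row))
```

Both views share the same text label, so a two-branch network with shared weights could be trained to produce consistent rectified outputs for them; how that consistency is enforced is left open here because the abstract does not specify it.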
Keywords/Search Tags: scene text detection, scene text recognition, feature pyramid network, representation learning, irregular text rectification