Text Image Recognition Based On Diverse Data And Weak-Supervised Learning

Posted on:2023-10-05

Degree:Doctor

Type:Dissertation

Country:China

Candidate:C J Luo

GTID:1528306830481994

Subject:Information and Communication Engineering

Abstract/Summary:

Text image recognition is an essential problem in computer vision. Many practical applications, such as intelligent traffic, product recognition, and image inspection, benefit from the rich semantic information carried by text. Text image recognition has therefore moved to the forefront of research and is regarded as an open and challenging problem. Models for regular printed text have achieved notable success. Nevertheless, most current recognition models remain unstable under environmental disturbances, such as the varied shapes of irregular text and complex background noise. Recognizing the many distinct individual handwriting styles also remains a great challenge. Recently, deep learning-based, data-driven approaches have become dominant. Improving their performance typically requires collecting and annotating large-scale text images for model training, which is time-consuming and labor-intensive. This thesis studies text recognition from the perspective of making better use of large-scale data. We increase data diversity through data augmentation and data synthesis to improve the robustness of recognition models, and we reduce the dependence on full annotations by using data in an adversarial, weakly supervised, or self-supervised manner. The research is carried out in the following three aspects:

(1) We tackle the problem of insufficient diversity of training samples by proposing a smart data augmentation approach for more effective and specific training data. Moreover, we tackle the problem of numerous handwriting styles by proposing a handwriting synthesis approach. By adjusting style parameters and content conditions, we can synthesize high-quality handwritten text images with diverse styles and rich vocabularies. Experiments show that data augmentation and synthesis significantly enrich the training samples and improve the robustness of the recognition model.

(2) We tackle the problem of insufficient model generalization in the wild by focusing on irregular shapes and complex background noise. We propose a weakly supervised multi-object rectification model and a self-aligned adversarial denoising model. Both are jointly trained with recognition models, using only text labels as supervision, to rectify irregular shapes and remove background noise. This significantly reduces the difficulty of text image recognition and improves the performance of the recognition model.

(3) We tackle the difficulty of using large-scale unlabeled data by exploiting a property unique to text images, rather than directly adopting mainstream contrastive learning approaches. Typically, neighboring image patches within one text line tend to have similar styles, including strokes, textures, and colors. Motivated by this observation, we propose a self-supervised representation learning scheme using similarity-aware normalization. We use the correlation within one text line to recover an augmented patch, with its neighboring patch as guidance. The decoupling and ensemble of content and style improve the representation quality. Moreover, the self-supervised generative model achieves encouraging performance on extended tasks such as data synthesis, text image editing, and font interpolation, suggesting a wide range of practical applications.

This thesis proposes several approaches and ideas for text image recognition in the era of big data. We hope these approaches prompt a rethinking of how data is used in the field of text recognition.
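
The neighboring-patch idea in aspect (3) can be illustrated with a small data-preparation sketch. It is not the thesis implementation: the function names, the fixed patch width, and the brightness-shift augmentation are all assumptions made for illustration, and the model that would recover the clean patch (including the similarity-aware normalization itself) is not shown. The sketch only builds the training tuples that such a scheme would consume: an augmented patch, an adjacent patch from the same text line as style guidance, and the original patch as the recovery target.

```python
import random


def split_into_patches(line_image, patch_width):
    """Cut a text-line image (a list of pixel rows) into
    non-overlapping patches, ordered left to right."""
    width = len(line_image[0])
    patches = []
    for x in range(0, width - patch_width + 1, patch_width):
        patches.append([row[x:x + patch_width] for row in line_image])
    return patches


def augment(patch, rng):
    """Hypothetical augmentation: one random brightness shift per patch,
    with pixel values clipped to the range [0, 255]."""
    shift = rng.randint(-40, 40)
    return [[min(255, max(0, p + shift)) for p in row] for row in patch]


def make_training_tuples(line_image, patch_width, seed=0):
    """Return (augmented_patch, neighbor_patch, target_patch) tuples.

    The neighbor is the adjacent patch to the right, or to the left
    for the last patch, so it comes from the same text line and is
    assumed to share its style (strokes, textures, colors).
    """
    rng = random.Random(seed)
    patches = split_into_patches(line_image, patch_width)
    tuples = []
    for i, target in enumerate(patches):
        j = i + 1 if i + 1 < len(patches) else i - 1
        if j < 0:
            continue  # a line with a single patch has no neighbor
        tuples.append((augment(target, rng), patches[j], target))
    return tuples


# Toy 4 x 12 grayscale "text line" cut into three 4-pixel-wide patches.
line = [[(r * 12 + c) % 256 for c in range(12)] for r in range(4)]
pairs = make_training_tuples(line, patch_width=4)
```

A model trained on such tuples would take the augmented patch and the neighbor as input and be penalized for deviating from the target, which is one way to read "recover an augmented patch by using its neighboring patch as guidance".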

Keywords/Search Tags:

Deep learning, optical character recognition, data augmentation, data synthesis, background noise removal, representation learning, generative adversarial network, weak-supervised, self-supervised, artificial intelligence

Related items

1	Research Of Self-supervised Representation Learning Based On Generative Adversarial Networks
2	Research On Semi-Supervised Network Traffic Classification System Based On Deep Learning
3	Research On Object Detection In The Wild Based On Deep Convolutional Neural Network
4	The Optimization Of Self-supervised Generative Adversarial Nets
5	Data Augmentation Based On Generative Adversarial Networks
6	The Research Of Feature Representation And Face Recognition Algorithm Combining With Facial Attribute Based On Deep Learning
7	Application Research On Semantic Recognition Of Questions Oriented To Purchasing Service Robot
8	Research On The Feature Selection Techniques Based On SVM For Network Data
9	Research And Application Of Semi-Supervised Deep Contrast Learning
10	Study Of Semi-supervised Soft Sensor Modeling Based On Deep Network