Research On Detection And Recognition Method Of Mixed Font Text

Posted on: 2022-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: W L Zhang
GTID: 2518306731977669
Subject: Computer technology
Abstract/Summary:
Since its inception, writing has been a symbol of the continuity of human civilization and an important carrier of information in communication between people. In recent years, with the continued rise of deep learning, research on text recognition technologies has grown rapidly. Its purpose is to convert text in documents or natural scenes into machine-readable characters. To a certain extent, recognition of a single type of text has matured. However, in many real-world applications, handwritten and printed text must be detected, recognized, and classified together. We call this mixed text. Examples include corrected test papers, contract files with handwritten signatures, and files used for business records.

At present, mainstream text detection networks can only locate text and cannot classify it, so classification has to be performed as a separate step after detection. This approach increases the complexity of the algorithm and reduces the efficiency of text detection and classification. The diversity of real scenes, the high similarity between handwritten and printed text, and the variety of handwriting styles also make text detection and classification extremely challenging.

Based on these observations, this thesis proposes a text detection network based on bilinear pooling for the joint detection and extraction of handwritten and printed text, aimed at the detection and recognition of mixed text. Research is carried out in three areas: training data enhancement, text detection, and text recognition. The work of this thesis is as follows:

1. Training data enhancement. In response to the lack of public data sets for mixed text, this thesis proposes using text mapping and ScrabbleGAN-based methods to generate text detection training data, and using an image enhancement method based on joint learning to enhance text recognition training data. In total, 2,000 text detection training samples and more than 800,000 text recognition training samples are generated, laying the data foundation for the subsequent detection and recognition work.

2. Text detection. Jointly detecting printed and handwritten text while classifying it in the detection stage is one of the main innovations of this thesis. In this part, the thesis proposes an end-to-end hybrid text detection network. The network uses an SPP+PANet structure to extract deep features and perform feature fusion, and a bilinear pooling module to extract highly discriminative features that separate handwriting from print. Experiments on the mixed text data set achieve 91.2% accuracy, 93.4% recall, and 7.8 FPS, verifying the performance of the network.

3. Text recognition. First, this thesis uses the CRNN text recognition network to train two models that recognize handwritten and printed text respectively. Then, in response to the lack of Chinese handwritten character data sets, a hybrid handwritten text recognition data set is constructed. Experiments verify that this self-made data set effectively improves the accuracy of the handwritten text recognition model.
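
The abstract does not describe the internals of the bilinear pooling module, so the following is only a minimal sketch of the commonly used formulation, not the thesis's implementation. It assumes two feature maps that share the same spatial locations: at each location the outer product of the two feature vectors is taken, the products are averaged over all locations, and the result is passed through a signed square root and L2 normalization. The function name `bilinear_pool` and the list-based input layout are hypothetical, chosen to keep the example dependency-free.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Bilinear pooling over N spatial locations (illustrative sketch).

    feats_a: list of N vectors, each of length C1 (assumed layout).
    feats_b: list of N vectors, each of length C2 (assumed layout).
    Returns a flat descriptor of length C1 * C2.
    """
    if not feats_a or len(feats_a) != len(feats_b):
        raise ValueError("both inputs need the same non-zero number of locations")

    c1, c2 = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (c1 * c2)

    # Sum the outer product a ⊗ b over every spatial location.
    for a, b in zip(feats_a, feats_b):
        for i in range(c1):
            for j in range(c2):
                pooled[i * c2 + j] += a[i] * b[j]

    # Average over locations.
    n = len(feats_a)
    pooled = [v / n for v in pooled]

    # Signed square root keeps the sign while compressing large values.
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]

    # L2 normalization; an all-zero descriptor is returned unchanged.
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


# Usage: pooling a feature map with itself, 2 locations and 2 channels.
features = [[1.0, 0.0], [0.0, 2.0]]
descriptor = bilinear_pool(features, features)
```

Because the descriptor captures pairwise channel interactions, it is often used where classes differ only in fine detail, which matches the abstract's stated goal of separating visually similar handwritten and printed text. In a real network the same computation would run on tensors inside a deep learning framework rather than on Python lists.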
Keywords/Search Tags:Image generation, Improved text detection network, Text recognition, End to end