
Research And Implementation Of OCR Algorithm Based On Text Knowledge Transfer

Posted on: 2024-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: H Yu
Full Text: PDF
GTID: 2568306944461304
Subject: Computer Science and Technology
Abstract/Summary:
With the development of deep learning, modern OCR methods are used in many scenarios, such as smartphone-based recognition, autonomous driving, and smart education. OCR is usually divided into two phases: text detection followed by text recognition. In scene text recognition, many of the images to be recognised are blurred, which interferes with accurate recognition. Most such algorithms rely on visual information alone, treating each character as an isolated symbol and ignoring the linguistic information that links characters to one another. In addition, many existing Chinese text line recognition algorithms are English recognition models trained on Chinese datasets. This approach treats each Chinese character as a separate class and therefore requires a large dataset with wide character coverage. If a character that did not appear during training is encountered at inference time, recognition can be disrupted. The work in this thesis addresses these issues as follows.

First, for scene text recognition, this thesis proposes a contrastive learning method that incorporates both visual and linguistic information. The method uses the edit distance between texts to measure their similarity, and improves scene text recognition by optimising a contrastive loss that pulls images of similar texts closer together in the representation space while pushing images of dissimilar texts farther apart. The proposed loss function is evaluated with two different recognition algorithms on six scene text datasets, and the results show that it yields some improvement over the original methods.

Second, for Chinese text line recognition, this thesis proposes a semantically enhanced recognition method based on stroke decomposition. Chinese characters are composed of five basic strokes: horizontal, vertical, left-falling, right-falling, and turning. The method first decomposes the Chinese text in the image into a sequence of strokes using an image-to-stroke module. Because different Chinese characters may correspond to the same stroke sequence, the method resolves this ambiguity by adding a language model that supplies text knowledge, yielding an accurate final text prediction. To verify the effectiveness of the proposed algorithm, experiments are conducted on three Chinese datasets (scene, web, and document) and compared with other methods. The experiments show that the proposed method performs well on the web dataset, which has a high proportion of low-frequency characters, indicating that decomposing text into stroke sequences helps Chinese text recognition.

Finally, the text recognition method proposed in this thesis is applied to a practical project, the automatic reading of student answer cards, and an answer card recognition system is designed and implemented. The system recognises the name, the student number, the filled-in answers to objective questions, and the scores of subjective questions, then formats and displays this information on the interface. It implements these requirements through three main modules (area detection, text detection, and text recognition) and provides a user-friendly interactive interface through which the user can recognise the information on an answer card.
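
The abstract does not give the exact form of the contrastive loss, so the following is only a minimal sketch of the general idea behind the first contribution: edit distance between label texts is turned into a similarity score, and that score weights how strongly each pair of image embeddings is pulled together. The normalisation by the longer string, the cosine similarity, the softmax form, the temperature value, and all function names are assumptions made for illustration, not the thesis's formulation.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def text_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical texts (assumed normalisation)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def cosine(u, v) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


def soft_contrastive_loss(embeddings, texts, temperature=0.1) -> float:
    """Hypothetical loss: cross-entropy between text-similarity weights and a
    softmax over embedding similarities, averaged over the batch."""
    n = len(texts)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        logits = [cosine(embeddings[i], embeddings[j]) / temperature
                  for j in others]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        weights = [text_similarity(texts[i], texts[j]) for j in others]
        weight_sum = sum(weights)
        if weight_sum == 0.0:
            continue  # no textually similar sample in the batch for this anchor
        total += -sum(w / weight_sum * (l - log_z)
                      for w, l in zip(weights, logits))
    return total / n


texts = ["hello", "hallo", "world"]
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]     # similar texts lie close together
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # similar texts lie far apart
print(soft_contrastive_loss(aligned, texts) < soft_contrastive_loss(misaligned, texts))
```

In this toy batch, the embeddings that place "hello" and "hallo" close together receive the lower loss, which is the behaviour the abstract describes. A real implementation would operate on batched tensors from the recognition model's encoder rather than on Python lists.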
Keywords/Search Tags: OCR, text recognition, Chinese recognition, contrastive learning, stroke