| With the development of Internet technology, images have become an important medium of communication. Because text carries precise semantic information, recognizing it accurately in images is essential to artificial intelligence applications in machine vision and related fields. Although Optical Character Recognition (OCR) technology has been studied for decades and has made significant progress, real application scenarios such as autonomous driving, navigation for the blind, and automatic bank check processing still contain many complex texts. These scene texts are highly irregular: they exhibit varied layouts, distracting backgrounds, diverse handwriting styles, and touching adjacent handwritten characters. In addition, because capture equipment and collection methods are often unprofessional, the captured images are frequently blurry and low-resolution. Complex scene text recognition therefore remains a challenging task. This paper studies recognition methods for two types of complex scene text. The main research contents and results are as follows: (1) For scene text images, a scene text recognition model based on text attention and semantic enhancement is proposed. First, to address background interference, a visual feature extraction module based on text attention is designed. A fully convolutional architecture makes pixel-level predictions on the image to implement an attention mechanism for text, so that the model adaptively suppresses background features and extracts more effective foreground features. Then, a rotation rectification network is designed to solve the problem that sequence-based scene text recognition methods cannot handle vertical text. The network predicts the arrangement direction and reading order of the text in the scene image, generates a rectification scheme, and rectifies the extracted two-dimensional visual features accordingly. Finally, a semantic enhancement model combining temporal convolution and a Transformer encoder is constructed, which effectively improves recognition accuracy on low-resolution and heavily noisy text images and can run in parallel. Experimental results show that the proposed model greatly exceeds the benchmark model, improving accuracy by more than 4% on multiple datasets. (2) The recognition of handwritten legal amounts on bank checks is studied within the segmentation-based text recognition framework, and a recognition algorithm for handwritten legal amounts based on a finite-state automaton is proposed. First, an automaton for grammar checking is constructed by classifying the characters and analyzing their grammatical logic. Then, the automaton is applied to optimize path search and to reject grammatically incorrect recognition results. Finally, to address missing strokes in low-quality bank checks, the constructed automaton is used to implement a character inference algorithm. Experimental results show that, with the automaton's grammar checking and prediction, line-level recognition accuracy for legal amounts reaches 96.6%. |
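
The automaton-based grammar check in part (2) can be illustrated with a minimal sketch. It is not the grammar used in this work: the character set (Chinese financial numerals), the four character classes, the state names, and the helper functions `accepts`, `filter_candidates`, and `infer_missing_class` are all assumptions made for illustration. The toy grammar also ignores zero handling and the ordering of units. It shows the three roles described above: classifying characters, rejecting ungrammatical candidate paths, and inferring the class of a character lost to missing strokes.

```python
# Hypothetical toy grammar: (DIGIT UNIT?)+ YUAN END, e.g. "叁仟伍佰元整".
DIGITS = set("壹贰叁肆伍陆柒捌玖")
UNITS = set("拾佰仟")

def char_class(ch):
    """Map a character to its (assumed) grammatical class, or None if unknown."""
    if ch in DIGITS:
        return "DIGIT"
    if ch in UNITS:
        return "UNIT"
    if ch == "元":
        return "YUAN"
    if ch == "整":
        return "END"
    return None

# Transition table of the finite-state automaton: (state, class) -> next state.
TRANSITIONS = {
    ("START", "DIGIT"): "AFTER_DIGIT",
    ("AFTER_DIGIT", "UNIT"): "AFTER_UNIT",
    ("AFTER_DIGIT", "YUAN"): "AFTER_YUAN",
    ("AFTER_UNIT", "DIGIT"): "AFTER_DIGIT",
    ("AFTER_UNIT", "YUAN"): "AFTER_YUAN",
    ("AFTER_YUAN", "END"): "ACCEPT",
}

def accepts(text):
    """Return True if the automaton ends in ACCEPT after reading text."""
    state = "START"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:  # no valid transition: grammatically incorrect
            return False
    return state == "ACCEPT"

def filter_candidates(candidates):
    """Keep only grammatical (text, score) candidates, highest score first."""
    valid = [(text, score) for text, score in candidates if accepts(text)]
    return sorted(valid, key=lambda pair: pair[1], reverse=True)

# One representative character per class, used to probe a missing position.
REPRESENTATIVE = {"DIGIT": "壹", "UNIT": "拾", "YUAN": "元", "END": "整"}

def infer_missing_class(text, placeholder="?"):
    """List the classes that make text grammatical at the first placeholder."""
    return sorted(
        cls
        for cls, rep in REPRESENTATIVE.items()
        if accepts(text.replace(placeholder, rep, 1))
    )

if __name__ == "__main__":
    paths = [("叁仟佰元整", 0.9), ("叁仟伍佰元整", 0.8), ("叁仟伍元整", 0.5)]
    print(filter_candidates(paths))
    print(infer_missing_class("叁仟伍?元整"))
```

In this sketch the highest-scoring path is rejected because two unit characters appear in a row, and the unreadable character can only belong to the unit class. A recognizer could then restrict its classifier to that class at the damaged position.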