| Text is widely found in natural scenes and often carries important semantic information, such as road names on street signs, readings on industrial meters, and product names on billboards. Accurate detection and recognition of text in natural scenes is of great value for understanding scene content. It has many applications, such as image and video retrieval systems, picture control, and automatic entry of bills. Scene text detection and recognition has received considerable attention in the academic community in recent years. Unlike in traditional optical character recognition, text in natural scenes varies in color, size, and layout. In addition, the background of scene text is highly complex, and the images suffer from problems such as low resolution, noise, and occlusion. All of these factors make natural scene text detection and recognition a challenging task. This paper studies the problem of text detection and recognition in natural scenes. For detection, it explores more concise algorithm designs that further improve detection performance and speed; three novel scene text detection algorithms based on deep learning are proposed, and their scope of application is discussed. For recognition, it explores more robust and efficient algorithm designs and proposes a novel scene text recognition algorithm based on deep learning. The algorithms are verified on standard datasets and in practical engineering applications. Specifically, the research in this paper mainly includes the following: 1. A novel algorithm for fast text detection based on multi-scale feature fusion: Existing scene text detection algorithms based on semantic segmentation have complicated pipelines and post-processing. Therefore, this paper proposes a novel algorithm for fast text detection based on multi-scale feature fusion. It is based on semantic segmentation and direct
regression. The proposed algorithm's pipeline is simple: it consists of a fully convolutional neural network followed by standard non-maximum suppression. Several design choices are introduced in the network structure to reduce the number of parameters and speed up detection. The proposed algorithm achieves state-of-the-art performance on several scene text detection datasets and runs at 11.1 frames per second on 720 × 1280 images, much faster than previous algorithms. 2. A novel long text detection algorithm based on endpoint detection: To address the poor performance of long text detection caused by the limited receptive field of the network, this paper builds on the previous method and proposes a novel long text detection algorithm based on endpoint detection. The proposed algorithm introduces text endpoint detection into the architecture design and uses a boundary generation algorithm as post-processing, which avoids inaccurate boundary prediction for long text caused by the limited receptive field. Results on several public datasets validate the effectiveness of the proposed algorithm. Both on datasets with a large quantity of long text and on ordinary datasets, the proposed algorithm achieves state-of-the-art performance, further improving on previous scene text detection algorithms. 3. A novel algorithm for irregular scene text detection based on progressive expansion: To detect irregular text, this paper builds on the backbone of the above algorithms and proposes a novel algorithm that can detect irregular scene text. The proposed algorithm is an instance segmentation algorithm that gradually expands the segmentation results. Test results on the public dataset verify the effectiveness of the proposed algorithm. 4. A novel algorithm for scene text recognition based on a double transcription layer mechanism: To speed up training and improve the robustness of the recognition results
of existing scene text recognition algorithms, this paper proposes a novel algorithm for scene text recognition based on a double transcription layer mechanism. The proposed algorithm is built on the basic convolutional neural network-recurrent neural network architecture. The context information and character information of scene text are modeled separately by introducing an additional transcription layer. The proposed algorithm can be trained end-to-end, converges quickly, and delivers excellent recognition performance. It achieves state-of-the-art performance on several scene text recognition datasets. |
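
The first contribution relies on standard non-maximum suppression as its only post-processing step. As a rough illustration of that step, the sketch below implements greedy non-maximum suppression in plain Python. It rests on assumptions that are not taken from the paper: boxes are axis-aligned `(x1, y1, x2, y2)` tuples (scene text detectors often output rotated boxes or quadrilaterals instead), and the names `iou`, `nms`, and the default threshold of 0.5 are hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

# Assumption: axis-aligned boxes given as (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    iou_threshold is an illustrative default, not a value from the paper.
    """
    # Visit candidates from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(nms(candidates, confidences))
```

In this example the second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed, while the third box is disjoint from both and is kept.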