
Research On Detection And Segmentation Of Text Information Under Complex Background

Posted on: 2023-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Liu
Full Text: PDF
GTID: 2568306908964159
Subject: Engineering
Abstract/Summary:
Text information extraction under complex backgrounds has become an increasingly popular and influential research area in computer vision. Its results are widely applied in fields such as industrial production, autonomous driving, and information retrieval, where they have significantly reduced production costs and effectively promoted the sustainable development of industrial production toward intelligence and high efficiency. To improve the accuracy and efficiency of text extraction under complex backgrounds, this thesis studies text detection and segmentation algorithms. The detailed contributions of this thesis are as follows.

(1) Text instances in complex scenes are usually highly variable, for example in language, shape, and font, which poses a significant challenge to the feature modeling and fitting ability of deep neural networks. To this end, the thesis proposes a novel text detection method termed RIAMAF (Region Information Reassembly Text Detection based on MultiFeature Fusion), which is based on region information reassembly with multi-feature fusion. First, a deformable convolution structure and a dual attention module are introduced to improve the network's ability to model arbitrary text features. Next, an information aggregation module termed SLIA (Skip Level Information Aggregation), built on a feature reassembly module termed RCA (Region Context Reassembly), is designed to improve the network's ability to select and model complex text feature information. These modules can effectively mine and aggregate information across different feature maps in the feature pyramid without increasing resource usage. Finally, a dynamic label generation mechanism is used to improve the modeling and generalization capability of the network.

(2) Text instances in complex backgrounds often present multiple fonts and structures. In addition, multiple text instances in a single image may differ significantly in texture, making it difficult to represent text regions with fixed text features. To address these problems, the thesis proposes a novel text segmentation method termed DAT-Seg (Dynamic Aware Text Segmentation based on Hard-ROI Mask). Specifically, a text detection branch is first introduced into the segmentation network, which enables the network to automatically localize text regions and recognize texture structures during online inference. Second, texture feature representation and pixel segmentation are further enhanced by fusing the results of the text detection branch with the text segmentation branch through a hard attention mechanism. Finally, multiple loss functions are designed for the training phase to enhance the network's ability to learn from the training data.

(3) Extensive experiments are designed to demonstrate the effectiveness of the proposed text detection and segmentation algorithms, and several acceleration strategies for the deep neural network models are explored. Specifically, the thesis designs ablation experiments on multiple components, generalization tests on multiple datasets, and other experiments to verify that the proposed algorithms can effectively detect and segment text information in complex backgrounds. Moreover, the proposed algorithms are optimized at the inference stage, and the resulting differences are compared experimentally.
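The abstract does not say how the detection and segmentation branches are fused, so the following is only a minimal sketch of one common reading of a hard attention (hard ROI) mask: pixels outside every detected text region are zeroed out, and the segmentation score is thresholded only inside those regions. The axis-aligned boxes, the 0.5 threshold, and the names `hard_roi_mask` and `fuse` are assumptions made for illustration and are not taken from the thesis.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def hard_roi_mask(height: int, width: int, boxes: List[Box]) -> List[List[int]]:
    """Build a binary mask that is 1 inside any detected box and 0 elsewhere."""
    mask = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the image so out-of-range detections are harmless.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = 1
    return mask


def fuse(seg_prob: List[List[float]], mask: List[List[int]],
         threshold: float = 0.5) -> List[List[int]]:
    """Label a pixel as text only if it lies inside the hard mask
    and its segmentation probability reaches the threshold."""
    return [
        [1 if m == 1 and p >= threshold else 0 for p, m in zip(prob_row, mask_row)]
        for prob_row, mask_row in zip(seg_prob, mask)
    ]


# Toy 3x4 probability map from a (hypothetical) segmentation branch.
seg_prob = [
    [0.9, 0.8, 0.1, 0.7],
    [0.6, 0.2, 0.1, 0.9],
    [0.1, 0.1, 0.1, 0.8],
]
# One box from a (hypothetical) detection branch covering the top-left 2x2 area.
boxes = [(0, 0, 2, 2)]

mask = hard_roi_mask(3, 4, boxes)
result = fuse(seg_prob, mask)
```

In this toy case the high scores in the right-hand column are discarded because they fall outside the detected box, which is the effect a hard mask is meant to have: the detection branch decides where segmentation is allowed to fire.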
Keywords/Search Tags: Text Detection, Text Segmentation, Multi-Feature Fusion, Attention Mechanism, Pixel Segmentation