
Research On Scene Text Detection Via Feature Fusion

Posted on: 2020-12-04    Degree: Master    Type: Thesis
Country: China    Candidate: Y Y Lu    Full Text: PDF
GTID: 2428330623966989    Subject: Computer Science and Technology
Abstract/Summary:
Text detection in scene images is an important research topic in computer vision. The task is challenging because of the complexity of scenes, the diversity of text, and the variable quality of images. This thesis studies and improves deep-learning-based scene text detection and proposes a multi-scale, arbitrary-orientation scene text detection method. The main research contents and innovations are as follows:

(1) Research on feature extraction based on multi-scale feature fusion. To address the inadequate use of information from the hidden feature layers of convolutional neural networks during text feature extraction, we propose a feature extraction method based on refined fusion of multi-scale features, with Feature Pyramid Networks (FPN) as the base model. In our model, a Global Average Pooling layer is added to ResNet50 for pre-training to capture more global information and reduce over-fitting, and a refined fusion network called RefineNet, which takes multi-scale features as input, serves as the lateral connection module of FPN to produce more complementary feature representations for the subsequent detection task. Experiments show that the method based on refined fusion of multi-scale features improves detection accuracy and provides technical support for feature extraction in subsequent text detection.

(2) Research on a horizontal text detection method based on differentiated text scales. Because single-size feature maps adapt poorly to multi-scale text detection, we propose a horizontal text detection method with differentiated scales, based on the observation that feature maps at different scales are sensitive to text at different scales. The method divides text into three scale ranges according to the longer side of the text label boxes, assigns the label boxes of each range to one of three Region Proposal Networks for training, merges the detection results of the three subsequent prediction networks, and removes duplicates, yielding a hierarchical detection model that adapts to text over a larger range of scales. Experiments show that the differentiated-scales method improves the accuracy of horizontal text detection.

(3) Research on an arbitrary-orientation text detection method based on focal loss for hard samples. Because horizontal detection boxes fit the orientation of ground-truth boxes poorly and text features are rotational, we add an angle channel with rotational symmetry to the regression network of the horizontal text detection model with differentiated scales, enabling text detection in arbitrary orientations. In addition, to address insufficient learning of hard samples during model training, we extend the hard-sample problem to hard classification and hard regression, design a separate focal loss function for each, and focus learning on the features of hard samples to improve the classification accuracy and localization accuracy of the model. Experiments show that adding angle training and focusing the loss on hard samples improves the accuracy of arbitrary-orientation text detection.
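
Two of the ideas in the abstract can be illustrated with a minimal Python sketch: routing label boxes into three scale ranges by their longer side, and a focal loss that up-weights hard samples. This is not the thesis's implementation. The thresholds `SMALL_MAX` and `MEDIUM_MAX`, the function names, and the box layout `(x_min, y_min, x_max, y_max)` are assumptions made for illustration, and the loss shown is the standard binary focal loss (with its common defaults `gamma=2.0`, `alpha=0.25`) rather than the separate classification and regression variants the thesis designs.

```python
import math

# Hypothetical pixel thresholds on a box's longer side; the abstract
# does not state the actual scale ranges.
SMALL_MAX = 64
MEDIUM_MAX = 256


def longer_side(box):
    """Return the longer side of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return max(x_max - x_min, y_max - y_min)


def split_by_scale(boxes, small_max=SMALL_MAX, medium_max=MEDIUM_MAX):
    """Group label boxes into three scale ranges, one per region proposal branch."""
    buckets = {"small": [], "medium": [], "large": []}
    for box in boxes:
        side = longer_side(box)
        if side <= small_max:
            buckets["small"].append(box)
        elif side <= medium_max:
            buckets["medium"].append(box)
        else:
            buckets["large"].append(box)
    return buckets


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for a probability p in (0, 1) and a label y in {0, 1}.

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    samples, so hard samples dominate the total.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With these assumed thresholds, a box whose longer side is 50 pixels would be used to train the small-scale branch only, and a positive sample predicted at 0.1 contributes far more loss than one predicted at 0.9.
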
Keywords/Search Tags: Scene, Text detection, Feature fusion, Differentiated scales, Focal loss