Research On Text Detection Algorithms In Natural Scene Images

Posted on: 2021-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: W B Chen
GTID: 2428330611499751
Subject: Computer technology
Abstract/Summary:
With the development of Internet technology, the volume of multimedia information, images in particular, has grown explosively. Text carries strong semantic information and can guide applications such as autonomous driving, robot navigation, and scene understanding. Text detection, a key step before text recognition, extracts the position of text in an image and provides accurate localization for the recognition stage; if detection performs poorly, the performance of the entire system suffers. This thesis therefore studies text detection in natural scenes and aims to improve the detection performance of the model. In recent years, with the continued development of deep learning, text detection methods based on object detection and semantic segmentation have gradually replaced traditional methods and brought large performance gains. Methods based on object detection have difficulty accurately locating irregular text. Methods based on semantic segmentation first classify each pixel with a segmentation network and then reconstruct text lines from the segmentation result, so they can detect irregular text effectively. Because irregular text is common in natural scenes, this thesis focuses on semantic segmentation and uses PixelLink as the baseline model. To address missed detections by the baseline, a multi-level feature fusion module is introduced. To address false detections, a dual attention mechanism is introduced to suppress background interference. During training, the dataset provides only the corner coordinates of each bounding box, so the pixel-level labels derived from them contain errors for the semantic segmentation task. Moreover, the labels are one-hot encoded, which is prone to overfitting when combined with cross-entropy loss. To address this, a label smoothing mechanism is introduced to improve the generalization of the model. Experiments are conducted on the ICDAR2013 and ICDAR2015 datasets. The results show that the improved PixelLink outperforms the baseline: the F1 value increases by 2.3% on ICDAR2013 and by 2.5% on ICDAR2015.
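As an illustration of the label smoothing idea mentioned in the abstract, the sketch below applies the common uniform formulation to a single pixel with two classes (background, text). The abstract does not state which smoothing formula or smoothing factor the thesis uses, so the uniform scheme, the value `epsilon=0.1`, and the function names `smooth_labels`, `softmax`, and `cross_entropy` are all assumptions made for this example.

```python
import math


def smooth_labels(one_hot, epsilon=0.1):
    """Mix a one-hot target with the uniform distribution over classes.

    Assumed (standard) formulation: y' = (1 - epsilon) * y + epsilon / K.
    """
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# One pixel, two classes: index 0 = background, index 1 = text.
one_hot = [0.0, 1.0]
soft = smooth_labels(one_hot, epsilon=0.1)

# A very confident (and correct) prediction for the text class.
logits = [-4.0, 4.0]
hard_loss = cross_entropy(one_hot, logits)
soft_loss = cross_entropy(soft, logits)

print("smoothed target:", soft)
print("loss with one-hot target: ", hard_loss)
print("loss with smoothed target:", soft_loss)
```

With the one-hot target, the loss keeps shrinking toward zero as the prediction becomes more extreme, which rewards over-confidence. With the smoothed target, the same extreme prediction incurs a larger loss, so the network is discouraged from fitting possibly noisy pixel labels with full confidence.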
Keywords/Search Tags:text detection, multi-layer feature fusion, dual attention mechanism, label smoothing