The Research On Scene Text Detection Based On Deep Learning

Posted on: 2021-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: X X Li
Full Text: PDF
GTID: 2428330632462922
Subject: Computer Science and Technology
Abstract/Summary:
Images play an essential role as a medium of information transmission. Detecting and recognizing the text in images helps that information spread among people, making its transmission more efficient and convenient. In computer vision, natural scene text detection is an active and challenging research topic. In many cases, the text contained in an image is an important cue for understanding the scene, so the more accurate the text detection is, the more accurate the semantic understanding of the image will be. The CTPN model proposed by previous researchers has some shortcomings: its text detection precision is not high, and it can only detect scene text in the horizontal direction. Building on further study and improvement of CTPN, this thesis proposes a new scene text detection model. The main content and achievements are as follows:

1. Given the strong feature extraction ability of ResNet, we use ResNet in place of the VGG16 used in the original network to extract deeper image features, which makes the extracted features semantically richer. We also apply multi-layer network feature fusion to make the extracted features more abundant.
2. In some deep learning models, an attention mechanism is used together with an RNN to assign more reasonable weights to context sequences. In this thesis, attention is introduced after the bidirectional LSTM to better learn the contextual relationships among proposals and obtain more accurate scene text detection results.
3. Because of a limitation of the model itself, the original CTPN can only connect proposals of various sizes in the horizontal direction. In this thesis, the least squares method is used to improve the text line construction method of the original network, so that the direction of scene text detection is extended from horizontal to oblique.

In summary, this thesis optimizes and improves the original CTPN by upgrading the feature extraction network, using multi-layer network feature fusion, applying an attention mechanism, and extending the detection direction of CTPN from horizontal to oblique. Several groups of comparative experiments indicate that the overall performance of the improved network is better than that of the original CTPN.
Keywords/Search Tags: natural scene text detection, ResNet, multi-layer network feature fusion, attention