
Research On Retrieval And Text Detection Of Non-plain Text Document Image

Posted on: 2020-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
GTID: 2428330602952430
Subject: Engineering
Abstract/Summary:
The rapid growth of digitized literature places higher demands on document image retrieval. Traditional document image retrieval methods rely too heavily on complex OCR and text similarity detection, while content-based image retrieval avoids these shortcomings. Content-based retrieval is useful for detecting duplicate texts and repeated publications in databases of academic journals and theses, and for querying relevant documents in massive collections. However, traditional text detection algorithms are not robust enough: recall is low against complex backgrounds and for text in multiple orientations. In image retrieval, performance is poor on images that contain little text or that mix text with graphics. To address these problems, this thesis starts from content-based document image retrieval and focuses on two parts: text detection and the retrieval system.

Text detection. This thesis combines MSER, which segments candidate regions, with SWT computed over pixel values to achieve multi-scale text detection and improve recall. Because this traditional method is inefficient on complex backgrounds and long text, an improved text detection algorithm based on the Faster R-CNN object detection algorithm is proposed. The algorithm exploits the strong contextual association within text regions by adding an LSTM network, which preserves the relationship between text context sequences. Adjusting the size of the RPN anchor boxes solves the problem of mislocalized and incompletely detected long text. For scenes that contain both long text and oblique angles, an improved text detection algorithm based on FCN is proposed. This algorithm uses feature maps of different sizes from different FCN layers, reduces the number of channels to cut the amount of computation, and predicts rectangular text geometry to locate each text region from head to tail. This improves the detection of long text in multiple orientations and reduces test time by 50%.

Document image retrieval framework. This thesis establishes a basic framework for document image retrieval based on offline CNN feature extraction and online cosine similarity matching. The framework consists of four parts: preprocessing, image feature extraction, image index construction, and online similarity matching. Preprocessing divides each image into a text area and a non-text area. The thesis then focuses on fusing CNN features and building the index. Through transfer learning, multiple CNN models pre-trained on the ImageNet 2012 dataset are introduced into the system and fine-tuned on a document image dataset to adapt them to the document image retrieval task. Because the features extracted by convolutional neural networks are high-dimensional, PCA dimensionality reduction is applied to lower the computational cost of retrieval and storage. To model each image more fully and improve retrieval precision, the thesis proposes a multi-model fusion strategy, the Rank weighted fusion feature strategy. Finally, an inverted index is built on bag-of-visual-words (BOW) features to reduce search time. Integrating these methods into the framework greatly improves the accuracy of the system and reduces retrieval time. On a multilingual dataset of 20,000 document images, the MAP of the system rises to 85% and the retrieval time falls by 27%.
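
The online matching and fusion steps of the retrieval framework can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not define the Rank weighted fusion feature strategy, so the fusion below is an assumed stand-in (weighted reciprocal-rank fusion of per-model rankings), and the function names, toy feature vectors, and model weights are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, index):
    """Rank document ids by descending cosine similarity to the query.

    `index` maps a document id to a feature vector computed offline
    (for example, a PCA-reduced CNN descriptor). Ties break on the id.
    """
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def fuse_rankings(rankings, weights):
    """ASSUMED fusion rule: weighted reciprocal-rank fusion.

    Each model contributes weight / rank_position to a document's fused
    score. This stands in for the thesis's rank-weighted strategy, whose
    exact form the abstract does not give.
    """
    fused = {}
    for ranking, weight in zip(rankings, weights):
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / position
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


# Toy features from two hypothetical CNN models for the same three documents.
index_a = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0], "doc3": [1.0, 1.0]}
index_b = {"doc1": [0.9, 0.1], "doc2": [0.2, 0.8], "doc3": [0.1, 0.9]}

ranking_a = rank_documents([1.0, 0.1], index_a)
ranking_b = rank_documents([1.0, 0.0], index_b)
fused_ranking = fuse_rankings([ranking_a, ranking_b], weights=[0.6, 0.4])
```

Fusing rankings rather than raw similarity scores avoids having to calibrate scores across models with different feature scales, which is one plausible reason to weight by rank; the weights themselves would have to be tuned on validation data.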
Keywords/Search Tags: Text Detection, Image Retrieval, Multi-Model Feature Fusion, Inverted Index, Transfer Learning