Research on Retrieval and Text Detection of Document Images

Posted on: 2018-04-02  Degree: Master  Type: Thesis
Country: China  Candidate: M F Li  Full Text: PDF
GTID: 2348330533969806  Subject: Computer technology
Abstract/Summary:
The development of the Internet and multimedia technology has created a need for efficient management of massive multimedia data. Document images, a special form of data carrying a large amount of text information, have driven the development of retrieval algorithms for more than 40 years. This thesis takes the implementation of a document image retrieval system as its core and explores image preprocessing, text detection, feature extraction, and retrieval. In the implementation, pixel-level operations are used to remove noise and highlight foreground information, and SIFT, HOG, and LBP features are compared for text detection and image retrieval. The research focuses on document text detection and content-based retrieval: a deep network combining an RPN with an LSTM is proposed for text detection, and a document image retrieval system is implemented.

(1) Text detection. Algorithms such as MSER, SWT, and ER Filter have difficulty achieving an effective balance among recall, boundary accuracy, and robustness. Building on deep-learning-based object detection, this thesis uses an LSTM to extract the context information of text regions and optimizes the detector by customizing the RPN anchor boxes in scale, aspect ratio, and vertical offset. This reduces false localization of text and missed detections of low-contrast text regions.

(2) Document image retrieval system. Based on the LLAH method, this thesis addresses the problem of invalid feature values in the original algorithm by optimizing feature point positions and increasing the number of feature points. LBP features are extracted to filter the retrieval results of the LLAH system without adding extra resources. The thesis also extracts HOG features from text line regions and combines them with LSH hash search; on a small-scale data set, the efficiency and accuracy meet existing needs.
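
The combination of HOG features from text line regions with LSH hash search can be illustrated with a minimal sketch. The abstract does not say which LSH family is used, so the random-hyperplane (sign of random projection) scheme below is an assumption, and the class name `HyperplaneLSH`, its parameters, and the short toy vectors standing in for HOG descriptors are all hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Toy LSH index (assumed scheme: sign-of-random-projection hashing)."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # One random hyperplane (normal vector) per signature bit.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.buckets = defaultdict(list)  # signature -> [(item_id, vector)]

    def _signature(self, vec):
        # Bit i is 1 when the vector lies on the non-negative side of plane i.
        return tuple(
            1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
            for plane in self.planes
        )

    def add(self, item_id, vec):
        self.buckets[self._signature(vec)].append((item_id, vec))

    def query(self, vec, top_k=3):
        # Only vectors in the query's bucket are compared exactly.
        candidates = self.buckets.get(self._signature(vec), [])
        ranked = sorted(candidates, key=lambda c: cosine(vec, c[1]),
                        reverse=True)
        return [item_id for item_id, _ in ranked[:top_k]]


# Toy 4-dimensional vectors standing in for HOG descriptors of text lines.
index = HyperplaneLSH(dim=4, n_bits=6, seed=42)
index.add("line_a", [0.9, 0.1, 0.0, 0.2])
index.add("line_b", [0.0, 0.8, 0.7, 0.1])
print(index.query([0.9, 0.1, 0.0, 0.2], top_k=1))
```

Vectors with a small angle between them tend to share a signature, so a query only needs exact comparison against one bucket. More signature bits make buckets smaller and lookups faster, at the cost of missing near neighbors that fall just across a hyperplane.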
Keywords/Search Tags:image retrieval, text detection, locally likely arrangement hashing, region proposal network