PDF Images Pre-processing For Formula Recognition

Posted on:2013-08-18

Degree:Master

Type:Thesis

Country:China

Candidate:S L Liu

Full Text:PDF

GTID:2298330362964325

Subject:Computer application technology

Abstract/Summary:

With the development of the information age, the amount of information is continuously expanding, and quick access to that information is one of the urgent needs of today's society. In view of this, this thesis presents PDF image pre-processing work for formula recognition, laying the foundation for future PDF formula recognition and retrieval. First, the PDF documents are parsed and the PDF images are extracted. Second, small images are filtered out according to image width and height, and the remaining images are classified according to the color information of true-color images, the histogram information of gray images, and the black pixels of binary images, which yields the text images used for character recognition. Finally, the PDF images and formula regions are pre-processed: low-resolution images are identified and enhanced according to stroke width, formula regions are located according to the circle of the special characters, formula regions are corrected according to the angle of the longest straight line of a symbol, and noise is removed according to the characteristics of the formula. The experiments showed that these pre-processing steps help to improve the recognition accuracy of formulas in PDF images.
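
The filtering and classification step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the size thresholds, the function names, and the representation of an image as rows of (r, g, b) tuples are all assumptions made for illustration, and the abstract does not state the actual decision rules.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (r, g, b), each 0..255
Image = List[List[Pixel]]      # rows of pixels (assumed representation)

# Hypothetical thresholds; the abstract does not give the real values.
MIN_WIDTH = 20
MIN_HEIGHT = 20


def is_too_small(image: Image,
                 min_width: int = MIN_WIDTH,
                 min_height: int = MIN_HEIGHT) -> bool:
    """Return True if the image should be filtered out by its size."""
    height = len(image)
    width = len(image[0]) if height else 0
    return width < min_width or height < min_height


def image_kind(image: Image) -> str:
    """Label an image as 'color', 'gray', or 'binary'.

    A pixel whose channels differ makes the image 'color'. Otherwise the
    image is 'binary' if every level is 0 or 255, and 'gray' if not.
    """
    saw_midtone = False
    for row in image:
        for r, g, b in row:
            if not (r == g == b):
                return "color"
            if r not in (0, 255):
                saw_midtone = True
    return "gray" if saw_midtone else "binary"


def black_ratio(image: Image) -> float:
    """Fraction of pure-black pixels, a simple feature for binary images."""
    total = sum(len(row) for row in image)
    if total == 0:
        return 0.0
    black = sum(1 for row in image for p in row if p == (0, 0, 0))
    return black / total
```

A caller would first drop images for which `is_too_small` is true, then branch on `image_kind` to apply the color, histogram, or black-pixel test appropriate to each type.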

Keywords/Search Tags:

PDF documents, Classifying PDF images, Enhancing images, Locating the formula regions, Formula recognition

Related items

1	Research On Technology Of Optical Formula Recognition
2	Research And Implementation On Detection And Recognition Algorithm For Mathematics Formulas In Documents
3	Method Research On Printed Mathematical Formula Recognition
4	Mathematical Formula Extraction In Printed-Chinese Documents Based On EEN Feature Function
5	System Of Mathematical Formula Recognition In Printed Chinese Documents
6	Research On Key Issues Of Printed Mathematical Formula Recognition
7	Research On The Mathematical Formula Recognition Technology For Printed Document
8	Automatic Recognition Of Basic Formula Under Mobile Phone Photographing
9	Mathematical Formula Recognition In Typeset Chinese Documents
10	Recognition And Index System Of Math Formula Based On Deep Learning