Offline Handwritten Document Recognition System For Mobile Platforms

Posted on: 2019-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: J X Zhang
Full Text: PDF
GTID: 2428330566998565
Subject: Computer Science and Technology
Abstract/Summary:
Offline handwritten document recognition mainly involves segmenting document images, detecting table structure, recognizing table cells, and extracting content that contains handwritten text. This speeds up the conversion of paper documents into electronic ones. The system runs on Android. Because document layouts are complex, a document image may contain a lot of noise or invalid information, so the document content must be classified by type in order to obtain the valid information, including text, charts, and so on.

To solve this problem, this thesis first describes a document preprocessing method for complex backgrounds. It separates text elements from non-text elements, filters out image noise, and then classifies the non-text elements of the document into different types of regions. On the document image containing only non-text elements, the morphological features formed by the table elements are chosen according to the candidate element. Text elements that lie inside the bounding box of a candidate element on the text-element image are extracted and grouped into text lines. Closed and semi-closed forms, with their inner and outer contour lines, are easily detected. The logical structure of a table is determined by the horizontal and vertical spacing of the text lines inside it. Identifying parallel tables depends on additional rules, and these rules are applied in the detection of wireless strip tables.

In Chinese character recognition, characters must be segmented before they are sent to the classifier for identification. Segmentation includes line segmentation and character segmentation, and its goal is to obtain images of individual Chinese characters. The accuracy of Chinese character segmentation directly affects the accuracy of recognition: if the segmentation result is wrong, it inevitably causes errors in the subsequent recognition and degrades the overall performance of the system. Our segmentation algorithm is characterized by a multi-threshold, multi-segmentation strategy. It runs segmentation separately with several thresholds and selects the most suitable threshold according to the results, which reduces the influence of a subjectively chosen threshold and of errors from one-time segmentation.

The system uses Android as its environment. Document segmentation, detection, and recognition of handwritten Chinese characters are its components. In testing, the entire system achieved good results.
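
The multi-threshold segmentation idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the thesis scores each trial or which thresholds it uses, so everything here is an assumption made for illustration: the input is a binary text-line image (1 = ink), characters are split on a vertical projection profile, and the hypothetical `width_penalty` score prefers segments whose width is close to the line height, on the assumption that handwritten Chinese characters are roughly square. The function names are likewise hypothetical.

```python
def column_profile(image):
    """Count ink pixels (value 1) in every column of a binary line image."""
    return [sum(col) for col in zip(*image)]


def split_columns(profile, threshold):
    """Return (start, end) column spans whose ink count exceeds the threshold."""
    spans, start = [], None
    for x, count in enumerate(profile):
        if count > threshold and start is None:
            start = x                      # a character candidate begins
        elif count <= threshold and start is not None:
            spans.append((start, x))       # a gap closes the candidate
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def width_penalty(spans, expected_width):
    """Mean deviation of span widths from the expected width (assumed score)."""
    if not spans:
        return float("inf")
    return sum(abs((e - s) - expected_width) for s, e in spans) / len(spans)


def segment_line(image, thresholds=(0, 1, 2)):
    """Segment once per threshold and keep the result with the lowest penalty."""
    profile = column_profile(image)
    expected_width = len(image)            # assumption: roughly square characters
    best = min(
        thresholds,
        key=lambda t: width_penalty(split_columns(profile, t), expected_width),
    )
    return best, split_columns(profile, best)


# Two 4x4 "characters" joined by a single noisy pixel in column 4.
line = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
]
best_threshold, characters = segment_line(line)
```

With a threshold of 0 the noisy pixel glues the two characters into one span, while a threshold of 1 separates them, so comparing several trials avoids committing to a single hand-picked value. A real system would likely use a richer score than segment width alone.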
Keywords/Search Tags:handwritten document, complex layout analysis, table recognition, handwritten character segmentation, Android