Studies In Self-adaptive Algorithms For Chinese Page Segmentation Technique Based On Complexity

Posted on: 2011-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Fan
GTID: 2198330332465190
Subject: Computer technology
Abstract/Summary:
With the development of electronics, computing, and artificial intelligence, demand for paperless handling of all kinds of information keeps growing, and the automatic processing of document images has drawn increasing attention from researchers. Automatic document image processing consists of two steps: document layout analysis and understanding, followed by recognition by an OCR system. In the first step, the document image is divided into segments, which are then classified into a small number of types so that the character regions can be passed to the OCR system. This thesis studies layout analysis of text documents, with complex document layouts as the main object of study. Its aim is to analyze a layout according to its complexity and to separate embedded images from the main body of the page, down to the individual paragraphs and titles. Before layout analysis, the input document is denoised and its skew is corrected. To handle layout complexity, the algorithm searches for connected components (connected domains) and combines them with prior knowledge of the layout. Regions that deviate from ordinary text are located through their connected components, and each such region is classified as a graphic, a table (form), or text according to its projection characteristics and the shape of its connected components. The remaining text is then segmented with a projection algorithm whose thresholds are set adaptively for each part, which yields the individual paragraphs and titles of the layout. The approach requires comparatively little computation, giving the algorithm higher efficiency.
Keywords/Search Tags: document layout analysis, projection algorithm, connected components, complexity
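
The projection step with an adaptive threshold mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual thresholding rule, so everything here is an assumption for illustration: the function names (`row_projection`, `find_runs`, `group_lines`), the binary-image representation as nested lists, and the rule that a gap between text lines starts a new block when it exceeds `factor` times the median gap.

```python
from statistics import median
from typing import List, Tuple

Run = Tuple[int, int]  # (first_row, last_row), both inclusive


def row_projection(image: List[List[int]]) -> List[int]:
    """Horizontal projection: number of foreground (1) pixels in each row."""
    return [sum(row) for row in image]


def find_runs(profile: List[int], min_ink: int = 1) -> List[Run]:
    """Return maximal runs of consecutive rows whose ink count is >= min_ink."""
    runs: List[Run] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y
        elif ink < min_ink and start is not None:
            runs.append((start, y - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def group_lines(runs: List[Run], factor: float = 1.5) -> List[Run]:
    """Merge text-line runs into blocks (for example paragraphs).

    Assumed rule, not taken from the thesis: the threshold adapts to the
    page by using factor * median of the gaps between consecutive lines.
    """
    if len(runs) < 2:
        return list(runs)
    gaps = [b[0] - a[1] - 1 for a, b in zip(runs, runs[1:])]
    threshold = factor * median(gaps)
    blocks: List[Run] = []
    block_start, block_end = runs[0]
    for gap, (start, end) in zip(gaps, runs[1:]):
        if gap > threshold:
            # Gap is unusually large for this page: close the current block.
            blocks.append((block_start, block_end))
            block_start = start
        block_end = end
    blocks.append((block_start, block_end))
    return blocks
```

Calling `group_lines(find_runs(row_projection(image)))` on a binarized, deskewed page returns the row ranges of the detected blocks. Deriving the threshold from the page's own gap statistics, rather than fixing it in advance, is one simple way to make the segmentation adapt to different font sizes and line spacings.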