The Key Technology Research Of Document Image Layout Analysis

Posted on: 2017-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Wei
GTID: 2348330482986779
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information, network, and communication technology, electronic documents have become the main carrier for disseminating and sharing information. The continuous growth of information puts tremendous pressure on the storage of electronic documents, which creates a need for layered compression. Layering separates a text image into foreground and background layers so that a different compression encoding can be applied to each. Layout analysis of the text image is a very important step in the layering process; it comprises layout segmentation and region recognition. During text image acquisition, images are prone to skew and to excess edge information, so skew correction and edge cropping are needed before layout analysis.

Skewed text images must be corrected. The Hough transform is a commonly used skew detection method, but it is computationally expensive. This thesis proposes an improved Hough detection method. First, the image is downscaled to reduce the number of pixels to be processed. At the same time, the sine and cosine values used by the Hough transform are pre-stored to reduce computation time. A second Hough detection pass is then run over a narrower detection range with a smaller angle increment. This reduces the amount of computation while preserving detection accuracy. Experimental results show that, compared with the standard Hough detection method, the average computational efficiency improves by about 20 times.

Photographs of paper documents usually include extra edge information that must be removed. This thesis first presents an edge cropping method based on projection. The method segments the image in multiple directions, gathers statistics on the edge information, and determines the boundary position, but it adapts poorly when the edge information is complex. A new contour-based method is therefore proposed. First, the contours of the different regions are extracted and their bounding rectangles are computed, which removes part of the edge information; a decision strategy then determines the boundary location. Experimental results show that this method adapts well to complex and irregular edge information.

Layout segmentation is a very important step in layout analysis: the text image is first divided into many subregions, and region recognition is then carried out. Considering the efficiency advantage of top-down methods, this thesis proposes a new layout segmentation algorithm based on fragment projection. The text image is first divided into N columns, and the horizontal and vertical projections of each column are then computed; through multiple projections, the text is divided into a number of subregions. Experimental results show that this method retains the characteristics of the projection method while avoiding the influence of image curvature on the layout, and it also adapts well to complex text images.
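
The coarse-to-fine skew detection outlined in the abstract (downscaling, pre-stored sine and cosine values, then a second pass over a narrower range with a smaller angle increment) could look roughly like the sketch below. This is an illustration under stated assumptions, not the thesis's implementation: the function names `best_skew` and `detect_skew`, the default parameters (`scale=2`, a ±15 degree search range, 1 degree and 0.1 degree steps), the strided downscaling, and the sum-of-squared-votes score are all choices made here for the example. The input is assumed to be a binarized image whose text pixels are nonzero.

```python
import numpy as np


def best_skew(binary, skew_candidates_deg):
    """Score each candidate skew with Hough-style voting and return the best one."""
    ys, xs = np.nonzero(binary)  # foreground pixel coordinates (row = y, col = x)
    if xs.size == 0:
        raise ValueError("image has no foreground pixels")
    candidates = np.asarray(skew_candidates_deg, dtype=float)
    # A text line skewed by s degrees has its normal at theta = s + 90 degrees.
    thetas = np.deg2rad(candidates + 90.0)
    # Sine/cosine lookup tables: computed once per pass, reused for every pixel.
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    # rho = x*cos(theta) + y*sin(theta) for every (pixel, theta) pair.
    rho = np.outer(xs, cos_t) + np.outer(ys, sin_t)
    bins = np.round(rho).astype(np.int64)
    bins -= bins.min()  # shift so that bin indices are non-negative
    # Assumed score: sum of squared votes per angle. It is largest when the
    # pixels of each text line fall into as few rho bins as possible.
    scores = [
        np.sum(np.bincount(bins[:, j]).astype(np.int64) ** 2)
        for j in range(len(thetas))
    ]
    return float(candidates[int(np.argmax(scores))])


def detect_skew(binary, scale=2, max_skew=15.0, coarse_step=1.0, fine_step=0.1):
    """Two-pass skew estimate in degrees (image coordinates, y pointing down)."""
    # Pass 1: coarse angle grid on a downscaled copy (fewer pixels to vote).
    small = binary[::scale, ::scale]
    coarse_grid = np.arange(-max_skew, max_skew + coarse_step / 2, coarse_step)
    coarse = best_skew(small, coarse_grid)
    # Pass 2: fine angle grid on the full image, only around the coarse result.
    fine_grid = np.arange(
        coarse - coarse_step, coarse + coarse_step + fine_step / 2, fine_step
    )
    return best_skew(binary, fine_grid)
```

With these defaults the two passes evaluate 31 + 21 = 52 candidate angles, whereas a single pass at 0.1 degree resolution over the same ±15 degree range would evaluate 301, and the first pass votes with only a fraction of the pixels. The speedup this particular sketch achieves has not been measured and should not be read as the figure reported in the abstract.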
Keywords/Search Tags: Document image, Skew detection, Edge clipping, Layout segmentation, Fragment projection method