Study On Document Image Layout Understanding

Posted on: 2003-07-11 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: J S Liu | Full Text: PDF
GTID: 1118360092475149 | Subject: Instrument Science and Technology
Abstract/Summary:
With the arrival of the information age, digital information has become one of humanity's most precious resources. Yet more and more information is recorded on paper, which is inconvenient for storage, searching and data mining, and entering it into a computer by hand is impractical. With the development of OCR (optical character recognition), some documents can now be processed automatically by computer, which saves a great deal of labor and money and greatly improves processing efficiency.

Document processing includes two stages: document layout understanding and OCR. After decades of research, OCR is approaching practical application, whereas document understanding only began to attract serious attention in the 1990s, and its shortcomings have restricted its application. Drawing on a large body of papers, technical reports and dissertations, this dissertation studies the theory of document image understanding. It focuses on document image skew estimation, form layout understanding and Chinese font recognition (OFR). The main work is as follows:

1. Preprocessing of document images. Document understanding and OCR are both sensitive to skew in the document image. This dissertation proposes a content-based method for estimating the skew angle of various kinds of documents. The wavelet transform, run-length smoothing and thinning are used to extract horizontal and vertical lines and text lines from the document, with the method chosen to suit the type of document image. An error method is then applied to reduce the estimation error. Experiments show that the method has high precision and adaptability.

2. Using the background information of a form to locate its geometric structure is an active research topic in document processing. We propose a best-coordinate-based method to extract the form structure. Horizontal and vertical lines and text lines serve as locating marks, and locating coordinates are created from these marks. The full document image is divided into several areas; because each area is very small, the effect of distortion is removed. The number of locating coordinates is reduced, because some locating marks may be missing when the document image is noisy, and the form structure can still be extracted. The method is therefore robust to interference.

3. Information about the layout structure must be obtained through form learning before form processing. Because only the handwritten characters on a form need to be processed, this dissertation proposes an automatic geometric structure learning method based on typographic and handwritten text recognition (THR) to find the handwritten areas. A method integrating supervised clustering with a support vector machine (SCSVM) is applied to THR. The supervised clustering is improved, and reject rules for it are established. The SCSVM method can also be applied to other recognition problems.

4. Font recognition methods are studied. Characters in different Chinese fonts share the same strokes; the differences between fonts lie in the detail signals. The wavelet packet transform has good localization in both the space and frequency domains. We apply wavelet packet texture features of the font to THR. We use BP neural networks and the Learning Subspace Method (LSM) to recognize the font. The strength of the BP network is its high recognition precision, but it learns very slowly and its parameters must be tuned by experience. LSM learns quickly and needs no manual intervention. By calling the LSM learning program, the recognition system can collect the samples it rejects and use them for re-learning. The integrated recognition system is thus able to re-learn, and its recognition precision improves through re-learning in practical use.
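The abstract names run-length smoothing as one of the tools for extracting lines and text lines, but does not give the exact variant used. As a rough illustration only, the following is a minimal sketch of the general run-length smoothing idea on a binary image stored as nested lists (1 = ink, 0 = background). The function names, the `max_gap` parameter and the choice to fill only gaps enclosed by ink on both sides are assumptions made for this sketch, not details taken from the dissertation.

```python
def smooth_row(row, max_gap):
    """Fill background runs of length <= max_gap that lie between two ink pixels.

    Assumption for this sketch: runs touching the row boundary are left alone.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel:
            # Gap length is the number of background pixels since the last ink pixel.
            if last_ink is not None and 0 < i - last_ink - 1 <= max_gap:
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out


def smooth_horizontal(image, max_gap):
    """Apply run-length smoothing along every row (merges characters into text lines)."""
    return [smooth_row(row, max_gap) for row in image]


def smooth_vertical(image, max_gap):
    """Apply run-length smoothing along every column by transposing, smoothing, transposing back."""
    columns = [list(col) for col in zip(*image)]
    smoothed = [smooth_row(col, max_gap) for col in columns]
    return [list(row) for row in zip(*smoothed)]
```

With a small `max_gap`, horizontal smoothing tends to join neighbouring characters into solid text-line blobs, while broken ruling lines are bridged; the resulting long runs could then feed a skew estimate or serve as locating marks, under whatever thresholds suit the scan resolution.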
Keywords/Search Tags: document image processing, layout understanding, skew adjustment, font recognition